Elucidating the sources of genetic variation within microsatellite alleles has important implications for understanding the etiology of human diseases. Mismatch repair is a well described pathway for the suppression of microsatellite instability. However, the cellular polymerases responsible for generating microsatellite errors have not been fully described. We address this gap in knowledge by measuring the fidelity of recombinant yeast polymerase δ (Pol δ) and ε (Pol ε) holoenzymes during synthesis of a [GT/CA] microsatellite. The in vitro HSV-tk forward assay was used to measure DNA polymerase errors generated during gap-filling of complementary GT10 and CA10 -containing substrates and ~90 nucleotides of HSV-tk coding sequence surrounding the microsatellites. The observed mutant frequencies within the microsatellites were four to 30-fold higher than the observed mutant frequencies within the coding sequence. More specifically, the rate of Pol δ and Pol ε misalignment-based insertion/deletion errors within the microsatellites was ~1000-fold higher than the rate of insertion/deletion errors within the HSV-tk gene. Although the most common microsatellite error was the deletion of a single repeat unit, ~ 20% of errors were deletions of two or more units for both polymerases. The differences in fidelity for wild type enzymes and their exonuclease-deficient derivatives were ~two-fold for unit-based microsatellite insertion/deletion errors. Interestingly, the exonucleases preferentially removed potentially stabilizing interruption errors within the microsatellites. Since Pol δ and Pol ε perform not only the bulk of DNA replication in eukaryotic cells but also are implicated in performing DNA synthesis associated with repair and recombination, these results indicate that microsatellite errors may be introduced into the genome during multiple DNA metabolic pathways.
Microsatellite sequences are repetitive sequences of one to six base pairs per repeat unit that are non-randomly distributed throughout all eukaryotic genomes. Dinucleotide microsatellites are highly abundant; specifically, GT/CA dinucleotides comprise 19% of all microsatellites in the human genome . A defining attribute of microsatellites is their high frequency of both expansion and deletion mutation, which results in allele length variation and a high degree of genetic polymorphism among individuals in populations . The transition from low mutability, characteristic of short tandem repeat sequences, to high mutability, characteristic of microsatellites, occurs once a threshold number of repeat units in the allele has been reached . Allele length polymorphisms at common mono- and dinucleotide microsatellites are implicated as genetic risk factors in several human diseases , including cystic fibrosis (CFTR gene) [5,6] and breast cancer (EGFR gene) [7,8]. Individuals with compromised postreplication mismatch repair (MMR), a pathway that repairs insertion/deletion (indel) mutations exceptionally well, are predisposed to the development of cancer (for recent review, see ). Tumors arising in these patients display widespread mononucleotide and dinucleotide microsatellite instability, and mutations within microsatellites associated with critical target genes are believed to play a causative role in the evolution of MMR-defective tumors [10,11]. Undoubtedly, elucidating the sources of genetic variation within common microsatellite alleles has important implications for understanding the etiology of human diseases.
Classically, indel mutations are proposed to arise during DNA replication by a slippage mechanism [12,13]. The slippage event creates a misaligned intermediate containing one or more extrahelical nucleotides, and depending on the stability of the misaligned stretch of DNA, the unpaired bases will be either inserted into, or deleted from, the DNA strand during the following round of replication [12,14]. Strong structural evidence supporting the slippage hypothesis emerged in 2006 when DNA polymerase λ was crystallized with a single-base deletion intermediate containing an unpaired nucleotide in the template strand [15,16]. The structure displayed the extrahelical nucleotide, the correct base pair at the primer terminus, and the geometry of a polymerase active site that was compatible with catalysis [15,16]. Experimental data obtained using purified DNA polymerases, bacteriophage, bacteria, yeast and human cell model systems are consistent with strand slippage models, in that all show mutation rates within tandemly repeated sequences or microsatellites that increase with an increase in the length of the repeat [3,17-23]. Consequently, interruptions in a repeated array dramatically reduce the mutation frequency [24,25].
DNA polymerases δ and ε are the only nuclear DNA polymerases in eukaryotic cells with an intrinsic 3′→ 5′ exonuclease (proofreading) activity. Thus, these DNA polymerases can be considered the “front-line” DNA repair mechanism for maintaining genome stability . The critical importance of polymerase proofreading in the maintenance of genome stability and avoidance of disease has been demonstrated elegantly in exonuclease-deficient mouse models [27,28]. In yeast model systems, the 3′→ 5′ exonuclease activity was shown to contribute to the removal of indel mutations within short repeated sequences [29,30]. However, the contribution of polymerase proofreading activity in suppressing indel mutations diminished with increasing length of a mononucleotide tandem repeat . Similarly, a previous in vitro study using the T7 DNA polymerase demonstrated that proofreading efficiency is diminished with an increase in the repeat tract length .
The fidelity of replicative eukaryotic DNA polymerases within microsatellites has not been investigated previously, despite the prevalence of such sequences and their potential to modify disease risk. Here, we examined mutagenesis associated with in vitro DNA synthesis by the holoenzyme forms of yeast polymerase δ (Pol δ) and polymerase ε (Pol ε), using the established HSV-tk microsatellite assay . The Pol δ holoenzyme is composed of three subunits: Pol3, Pol31, and Pol32. Pol3 is the catalytic subunit containing both the polymerase and exonuclease active sites. The Pol ε holoenzyme is composed of four subunits: Pol2, Dpb2, Dpb3, and Dpb4. Pol2 is the catalytic subunit and, similar to Pol3 of Pol δ, contains both the polymerase and exonuclease activities of the enzyme. We compared the frequency of indel errors created by each polymerase within a [GT/CA]10 microsatellite with the frequency of indel errors within the HSV-tk gene coding region. To quantitatively assess the contribution of proofreading to microsatellite stability, we also conducted synthesis reactions using holoenzyme preparations that are exonuclease deficient. The results of this study emphasize the vital role played by cellular MMR in yeast for the suppression of DNA sequence variation within genomic microsatellites.
Overexpression of proteins was performed in Saccharomyces cerevisiae strain BJ2168 (MATa, ura3-52, trp1-289, leu2-3,112, prb1-1122, prc1-407, pep4-3). For Pol δ holoenzyme overexpression, BJ2168 was transformed with pBL341 and either pBL335 or pBL335-DV, and growth and induction were as described previously . Cells (60 g of packed cells resuspended in 8 mL of dH2O) frozen in liquid nitrogen in popcorn form were ground using a Spex Sample Prep 6870 freezer mill, which lysed the cells by magnetic motion. Purification continued as previously described in . For the Pol ε holoenzyme, pJL1 or pJL1-exo and pJL6 were transformed, grown, and expressed as previously reported . The four-subunit Pol ε was purified in the same manner as previously described . Activity of the purified enzyme was determined by a specific activity assay using labeled activated calf thymus DNA. To establish the contribution of exonucleolytic proofreading to microsatellite sequence stability, we also purified holoenzyme forms with catalytic subunits that are deficient in proofreading activity. The mutations introduced for the exonuclease-deficient polymerase forms (Pol δ = D520V in domain ExoIII; Pol ε = double mutation of D289A and E291A) have been described previously, and have been shown to completely abolish in vitro proofreading activity [35-37].
An in vitro assay for the quantitation of DNA polymerase errors within microsatellite sequences has been described previously . In this assay, an artificial microsatellite sequence, [GT/CA]9, was inserted in-frame between bases 111 and 112 of the HSV-tk target, in the sequence context [GT (insert) TCTC] on the sense strand. Bases flanking the insert are considered part of the microsatellite region; therefore, the entire microsatellite motif is [GT/CA]10. For the current study, a StuI restriction site at HSV-tk position 180 was created and subcloned into the aforementioned vectors as described , allowing our coding region mutational target to be shortened from ~200bp to ~90 bp. Linear DNA fragments bearing a functional cat gene and ssDNA bearing a nonfunctional cat gene were prepared, and used to construct gapped duplex (GD) molecules as described [32,38]. The pSStu2 GD substrate contains the [GT]10 microsatellite and surrounding HSV-tk coding sequence within the single-stranded gap, which serves as the template for DNA polymerase reaction. The pSAStu2 GD substrate contains the complementary [CA]10 microsatellite and HSV-tk sequences within the template sequence (Figure 1).
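As a small illustration of how a nine-unit insert yields a ten-unit motif, the following Python sketch assembles only the sense-strand context quoted above (GT, the [GT]9 insert, then TCTC) and counts the uninterrupted GT units. The helper name `count_tandem_units` is hypothetical, and the rest of the HSV-tk flanking sequence is not represented.

```python
def count_tandem_units(seq: str, unit: str) -> int:
    """Count uninterrupted copies of `unit` starting at the 5' end of `seq`."""
    count = 0
    pos = 0
    while seq.startswith(unit, pos):
        count += 1
        pos += len(unit)
    return count


# Sequence context stated in the text: GT (insert) TCTC on the sense strand.
flank_5 = "GT"        # HSV-tk bases immediately 5' of the insert
insert = "GT" * 9     # the artificial [GT]9 insert
flank_3 = "TCTC"      # HSV-tk bases immediately 3' of the insert

sense_context = flank_5 + insert + flank_3
units = count_tandem_units(sense_context, "GT")
tract_length_nt = units * len("GT")

print(f"GT units: {units}; tract length: {tract_length_nt} nt")
```

Counting the flanking GT as part of the tract is what turns the nine inserted units into the [GT]10 motif analyzed in this study.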
In vitro gap-filling reactions contained 0.125 pmol of GD substrate and 250 μM dNTPs in 100 μL final volume. The minimal amount of polymerase required to achieve complete gap-filling was determined empirically by titration for each polymerase and GD preparation. Pol δ reactions contained 40 mM Tris pH 7.8, 8 mM MgOAc, and 1.25 - 3.1 pmol of Pol δ WT or Pol δ Exo-. Reactions were incubated at 30°C for 1 hour. Pol ε reactions contained 50 mM Tris pH 7.5, 8 mM MgCl2, 2 mM DTT, 100 μg/mL BSA, 10% glycerol, and 0.125 - 1 pmol Pol ε WT or 0.625 - 1.9 pmol Pol ε Exo-. Reactions were incubated at 30°C for 30 min. All reactions were terminated with 15 mM EDTA and the buffers exchanged into TE buffer. Because the determination of complete gap-filling was crucial for accurate mutational analyses, two different, yet complementary, methods were used (Figure 1B and 1C). First, 50 fmol of product were analyzed by separation through an 0.6% agarose gel for ~18 hours, along with GD and nicked (completely filled) standards. This standard analysis will detect the presence of starting (i.e., unfilled) GD substrate, as well as intermediate reaction products migrating between the standards. However, the migration patterns of complete gap-filling products and intermediate gap-filling products can be ambiguous due to the low resolution of the agarose gel. To more definitively differentiate complete gap-filling products from intermediate products, a denaturing polyacrylamide gel analysis was developed (Figure 1A). In this approach, 50 fmol of polymerase reaction products or 50 fmol of starting GD substrate were digested with 2 units of MluI and StuI at 37°C for 1 hour. Reaction products were exchanged into TE buffer and 5′-end labeled in an exchange reaction using 10 μCi of [γ-32P]ATP and 10 units of T4 kinase at 37°C for 10 min. 
Reactions were terminated with 15 mM EDTA, purified through a G25 Sephadex column, added to an equal volume of stop dye (99% v/v formamide, 5 mM EDTA, 0.1% (w/v) xylene cyanol, and 0.1% (w/v) bromophenol blue), and the products separated through a 6% denaturing polyacrylamide gel. Polymerase synthesis of ~five nucleotides from the GD 3′OH will create one complete double-stranded restriction enzyme site, whereas the starting GD substrates encode only partial sites (Figure 1A). Thus, little to no MluI-StuI digestion products are expected from the unfilled GD substrate, whereas 111 nt and 115 nt strands are expected from complete gap-filling products (Figure 1B,C). The method is strictly qualitative in design, because the efficiency of the T4 kinase reaction differs between strands (due to different sequences surrounding the 5′PO4) and because the recovery of radiolabelled DNA fragments from the G25 columns is variable. Nevertheless, the analysis can clearly identify incomplete gap-filling DNA products as bands shorter than 111 or 115 nt (Figure 1C).
To select for HSV-tk mutations, an aliquot of DNA from complete reactions was used to transform recA13, upp, tdk E. coli strain FT334 by electroporation. The background frequency of the gap-filling in vitro assay was determined by using the unfilled (starting) GD substrates for electroporation. In all cases, electroporated bacteria were plated on VBA selective media as described [32,39]. The presence of 50 μg/mL chloramphenicol (Cm) selects for progeny of the polymerase-synthesized strand and the presence of 40 μM 5-fluoro-2′-deoxyuridine (FUdR) selects for bacteria bearing HSV-tk mutant plasmids. The observed HSV-tk mutant frequency (MF) is defined as the number of FUdRRCmR colonies divided by the number of CmR colonies.
Independent FUdR-resistant mutants were isolated as described [32,39] from at least two separate polymerase reactions for each enzyme and template combination, as well as from unfilled GD DNA. The mutant plasmids were sequenced to identify the location of the mutation within the HSV-tk target. Sequencing reactions were performed using the PE-Applied Biosystems automated sequencing reagents and the TK309-MSI sequencing primer (5′CCGCCAGTAAGTCAT). Analysis was completed on a Perkin-Elmer ABI Model 3100 Sequencer. Seqman, a sequence alignment program created by Lasergene (licensed to the National Institute of Environmental Health Sciences), was used to analyze the data. In some cases, 5′ strand displacement synthesis or 3′ exonuclease digestion occurred, resulting in polymerase errors outside of the MluI-StuI gap target. Such mutations were excluded from the data analyses. Fisher's exact test (two-sided) was used to determine the statistical significance of differences in the proportions of specific classes of mutations, using the absolute numbers of observed mutants within each class (provided in the Tables).
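For readers who wish to reproduce this kind of comparison, the sketch below shows one common way to compute a two-sided Fisher's exact test on a 2 × 2 table of mutant counts, summing the probabilities of all tables that are no more likely than the observed one. It is not the software used in this study; the function name and the example counts are hypothetical and are not taken from the Tables.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, p_value)


# Hypothetical counts (NOT from this study): interruption errors versus
# unit-based indel errors recovered for two enzyme forms.
p = fisher_exact_two_sided(8, 2, 1, 5)
print(f"two-sided p = {p:.4f}")
```

Because the test uses absolute mutant counts rather than frequencies, the same proportions can give different p-values depending on how many mutants were sequenced.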
To quantitate polymerase errors, the observed HSV-tk MF was first corrected to reflect only those mutational events arising within the MluI–StuI target region. This proportion ranged from 90-95% for the various Pol δ reactions, and 59-92% for the Pol ε reactions. The resulting frequency is referred to as the “overall” HSV-tk MF in Table 2. For example, the observed HSV-tk mutant frequency for Pol δ WT using the pSStu2 substrate was 43 × 10⁻⁴ (Table 1). Of the 96 mutant plasmids sequenced, 86 events were within the MluI-StuI target; thus, the overall Pol δ WT HSV-tk MF is (86/96)(43 × 10⁻⁴), or 38 × 10⁻⁴ (Table 2). Next, the frequency of specific types of mutational events was calculated by multiplying the overall MF by the proportion of sequence changes observed. For the Pol δ WT example, 76 events were within the GT10 microsatellite target; thus, the microsatellite HSV-tk MF is (76/86)(38 × 10⁻⁴), or 34 × 10⁻⁴ (Table 2). Finally, the estimated polymerase mutant frequency (Pol MFest) for specific errors was calculated by subtracting the unfilled GD substrate MF (also referred to as the background MF) from the HSV-tk MF. For the Pol δ WT example, the Pol MFest for errors within the GT10 microsatellite is estimated to be (34 × 10⁻⁴) - (2.6 × 10⁻⁴) = 31 × 10⁻⁴ (Figure 2A).
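The three-step correction can be written out as a short Python sketch using the Pol δ WT / pSStu2 numbers quoted in the text. The function and variable names are hypothetical. Because the sketch carries unrounded intermediates, its output may differ in the last digit from the rounded values reported in the Tables.

```python
def scale_mf(mutant_frequency: float, events_in_class: int, events_total: int) -> float:
    """Scale a mutant frequency by the proportion of sequenced events in a class."""
    return mutant_frequency * events_in_class / events_total


# Values quoted in the text for Pol δ WT on the pSStu2 ([GT]10) substrate.
observed_mf = 43e-4          # observed HSV-tk MF (Table 1)
sequenced_mutants = 96       # mutant plasmids sequenced
in_target = 86               # events within the MluI-StuI target
in_microsatellite = 76       # events within the GT10 microsatellite
background_mf = 2.6e-4       # unfilled GD substrate MF subtracted in the text

# Step 1: "overall" MF, restricted to events inside the target region.
overall_mf = scale_mf(observed_mf, in_target, sequenced_mutants)

# Step 2: MF for one class of events (here, microsatellite errors).
microsatellite_mf = scale_mf(overall_mf, in_microsatellite, in_target)

# Step 3: estimated polymerase MF after subtracting the background.
pol_mf_est = microsatellite_mf - background_mf

for label, value in [
    ("overall MF", overall_mf),
    ("microsatellite MF", microsatellite_mf),
    ("Pol MFest", pol_mf_est),
]:
    print(f"{label}: {value / 1e-4:.1f} x 10^-4")
```

Steps 1 and 2 together are equivalent to multiplying the observed MF by 76/96, since the number of in-target events cancels.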
DNA polymerases δ and ε are widely believed to perform the bulk of DNA replication elongation in eukaryotic cells, with one current model suggesting that Pol δ replicates primarily the lagging-strand template  and Pol ε replicates primarily the leading-strand template . In addition, these polymerases have been implicated in the DNA synthesis associated with several DNA repair pathways [42,43]. Both Pol δ and Pol ε are highly accurate during in vitro synthesis of the lacZ target sequence, and introduce limited base substitution and indel errors [37,44]. This high fidelity is due, in part, to the intrinsic proofreading exonuclease activities . Although previous in vitro studies have shown that the 3′→ 5′ exonuclease activity of replicative polymerases can remove indel errors within short (≤ 5 bases) mononucleotide repeats, the efficiency of proofreading microsatellites with larger repeat sizes has not been determined directly.
Gap-filling DNA polymerization reactions were performed using complementary DNA substrates, containing either an in-frame [GT]10 (pSStu2) or [CA]10 (pSAStu2) microsatellite sequence surrounded by HSV-tk gene coding sequence. The gapped duplex (GD) molecules contain a single-stranded gap of 91 - 95bp of HSV-tk sequence and 20bp of microsatellite sequence. The HSV-tk coding region sequence serves as a monitor for polymerase indel fidelity, as it contains 23 repeated mono- and dinucleotide sequences of two to three units each. (The DNA sequences of the target regions are provided in Supplemental Figure 1.) In these constructs, the target sequence contains few detectable sites of base substitution mutations, and is biased towards the detection of strand misalignment-based errors. Complementary agarose gel and denaturing polyacrylamide gel electrophoresis analyses were used to ensure that all enzymes completely filled both GD substrates under the stated reaction conditions (Figure 1). The Pol δ WT reactions generated DNA products with mutant frequencies that were two- to nine-fold higher than the corresponding unfilled GD background reactions (i.e., “no polymerase” control), while the Pol ε WT mutant frequencies were increased three- to four-fold over unfilled GD background (Table 1). As expected, the HSV-tk mutant frequencies measured for the exonuclease-deficient Pol δ and Pol ε reactions were higher than those measured for the WT holoenzymes (Table 1).
The HSV-tk forward mutation assay detects any mutation that inactivates the protein, both within the artificial microsatellite and within the surrounding protein-coding sequences present in the target. Therefore, independent mutant colonies were collected and sequenced to identify the location and type of mutation. A complete listing of the errors observed within the mutational target by all four enzymes and two substrates can be provided upon request. The majority of both Pol δ and Pol ε errors produced during the gap-filling reactions were within the [GT]10 or [CA]10 microsatellite sequences (Table 2).
The HSV-tk MF measured for the pSStu2 and pSAStu2 unfilled GD substrates differ by ~3-fold (Table 1). Therefore, in order to directly compare polymerase mediated errors generated on the complementary DNA strands, we estimated the polymerase mutant frequency (Pol MFest) within each region by subtracting the unfilled GD background frequency from the corresponding HSV-tk frequencies for each polymerase reaction. The resulting Pol MFest for Pol δ WT within the [GT]10 and [CA]10 microsatellites is eight to 30-fold higher than the corresponding Pol MFest within the HSV-tk coding region (Figure 2A). The Pol MFest for Pol ε within the microsatellite sequences is four to six-fold higher than the corresponding coding region (Figure 2B). In addition, a strand bias for Pol δ WT microsatellite errors was observed, in that the Pol MFest within the [GT]10 allele is three-fold higher than the Pol MFest within the complementary [CA]10 allele (Figure 2A). A smaller but opposite trend was observed for Pol ε WT, in that the Pol MFest within the CA allele is 1.8-fold higher than that for the GT allele (Figure 2B). Mutational strand biases have been observed previously for polymerases α and β within microsatellite sequences .
We observed previously that DNA polymerases in vitro produce two classes of mutations within a microsatellite sequence: unit-based indel errors and interruption errors [32,38,45]. Although Pol δ WT and Pol ε WT also produced both classes of errors within the [GT]10 and [CA]10 microsatellite sequences, the majority of errors (82-100%) were unit-based indel mutations (Table 3). Both polymerases are highly biased towards the production of deletion errors, compared to insertion errors, within the microsatellite sequences. Moreover, approximately 20% of the indel deletions were of two or more repeat units (Table 3). The largest Pol δ WT deletion observed was the removal of four repeat units (eight nucleotides) in the [CA]10 substrate, while the largest Pol ε WT deletion was five repeat units within the [GT]10 substrate (data not shown).
Both unit-based microsatellite indels and traditional frameshift indel errors arise by a strand misalignment mechanism. To directly compare the likelihood of the two types of polymerase misalignment errors, we calculated the Pol MFest per site for indel errors within the 23 repeated sequence DNA motifs contained in our HSV-tk target sequence (Supplemental Figure 1). The replicative holoenzymes produce microsatellite misalignment errors at a rate that is nearly three orders of magnitude (~900-fold) greater than indel errors within the HSV-tk sequence (Table 4).
Microsatellite interruptions have been observed in previous polymerase studies [32,38]. Both Pol δ WT and Pol ε WT produce more interruptions on the [GT]10 template compared to the [CA]10 template (Table 3). This difference is statistically significant for Pol δ (p ≤ 0.0001, Fisher's exact test). In addition, Pol δ WT produces more interruptions (18% of errors) than does Pol ε WT (2% of errors) within the [GT]10 sequence (p = 0.014, Fisher's exact test) (Table 3).
To investigate the extent to which the exonuclease activities of Pol δ and Polε contribute to microsatellite sequence fidelity, we determined the polymerase mutant frequencies of exonuclease-deficient forms. The HSV-tk mutant frequencies within the GT10 and CA10 microsatellite regions observed for the exonuclease-deficient enzymes generally differed from those observed for WT enzymes by less than two-fold (Table 2). When examining specifically the frequency of indel errors within the microsatellites, we observed Pol MFest differences of 1.2-fold between the WT and exonuclease-deficient Pol ε forms (Figure 3). A larger two- to four-fold difference in the coding region Pol MFest was observed between WT and exonuclease-deficient Pol ε forms (Table 2). For the Pol δ forms, the differences between WT and exonuclease-deficient forms were 1.4-fold and 2.6-fold for the [GT]10 and [CA]10 templates, respectively (Figure 3). A three- to seven-fold difference in HSV-tk mutant frequencies within the coding regions was observed between the WT and exonuclease-deficient forms of Pol δ during synthesis of the two templates (Table 2). The differences in proofreading efficiency on the complementary pSStu2 and pSAStu2 templates likely reflect the influence of DNA sequence context on polymerase error production and/or exonuclease activity (i.e., hotspot differences on the complementary strands; see Figure S1). These magnitudes of Pol δ and Pol ε exonuclease enhancement of fidelity at the HSV-tk coding sequence are consistent with previous studies using the same enzymes and a lacZ indel reversion target sequence . The data in Figure 3 demonstrate that the exonuclease activity does not contribute strongly to the fidelity of the holoenzymes during dinucleotide microsatellite DNA synthesis.
Interestingly, the proofreading exonuclease activity of both polymerases preferentially removes interruption errors within the microsatellites (Table 3). Exonuclease-deficient Pol ε produced significantly more interruptions during [GT]10 DNA synthesis, compared to the wild-type enzyme (p < 0.0001, Fisher's exact test). Similarly, exonuclease-deficient Pol δ produced significantly more interruptions during [CA]10 DNA synthesis, compared to the wild-type enzyme (p = 0.01, Fisher's exact test). A strand bias for interruptions was also apparent, with more interruptions occurring within the [GT] than the [CA] template sequence for both exonuclease-deficient polymerases (p = 0.0043, Pol δ; p = 0.0002, Pol ε; Fisher's exact test). These mutational biases may reflect, in part, the inherent error specificities of both enzymes previously measured within the lacZ mutational target [37,44].
This study is the first to investigate the in vitro fidelity of replicative, eukaryotic holoenzymes within a tandem repeat array that can be defined as a microsatellite allele . Using the well-defined in vitro HSV-tk polymerase fidelity assay [3,32,38,45], high polymerase mutant frequencies (~10⁻³) were measured during microsatellite DNA synthesis by the wild-type holoenzymes, which contrasts with the low mutant frequencies (~10⁻⁴) measured during synthesis of a non-repetitive sequence (Figure 2). Error correction by the respective proofreading exonucleases contributed little to the overall fidelity of the polymerases within the microsatellite sequences (Figure 3). These data support the previous suggestion that microsatellites are “at-risk” sequences for variation within the eukaryotic genome .
We investigated whether the high fidelity of replicative polymerases previously measured for base substitution and indel errors within genes would be mirrored in longer, tandem repeats (e.g., microsatellites). As clearly shown here, high fidelity is not maintained for either Pol δ or Pol ε within the [GT/CA]10 microsatellite. Directly comparing misalignment-based errors, the fidelity of Pol δ WT and Pol ε WT for unit-based indel errors created during synthesis of the microsatellite alleles is ~1000-fold lower than the fidelity for indel errors created during HSV-tk gene synthesis (Table 4). The results for Pol ε were unexpected, as this polymerase is among the most accurate of DNA polymerases in vitro for single base indel errors . The addition of the replication accessory proteins RPA, RFC and PCNA to yeast Pol δ in vitro reactions was previously shown not to alter the error rate for one nucleotide deletions . Likewise, the addition of RFC and PCNA to human Pol δ reactions does not alter the frequency of errors within the GT10 microsatellite¹. [GT/CA] mutagenesis during genomic DNA replication will be the summation of polymerase errors on the [GT] strand plus errors on the [CA] strand. Assuming that Pol δ and Pol ε replicate complementary strands and that the polymerases display similar fidelity in vivo, we estimate conservatively that the combined Pol δ + Pol ε error frequency during [GT+CA] synthesis is 2.5-5 × 10⁻³ (Table 5). This frequency corresponds to an expected mutation rate of 1 mutant per 200 – 400 [GT/CA] alleles per round of DNA replication.
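The conversion from the combined error frequency to "1 mutant per N alleles" is simply a reciprocal, as the following minimal Python sketch shows. The function name is hypothetical, and the two inputs are the bounds of the range stated in the text rather than individual entries from Table 5.

```python
def alleles_per_mutant(error_frequency: float) -> float:
    """Number of alleles replicated, on average, per mutant allele produced."""
    if error_frequency <= 0:
        raise ValueError("error_frequency must be positive")
    return 1.0 / error_frequency


# Bounds of the combined Pol δ + Pol ε error frequency stated in the text.
low_frequency = 2.5e-3
high_frequency = 5e-3

fewest_alleles = alleles_per_mutant(high_frequency)  # higher frequency, fewer alleles
most_alleles = alleles_per_mutant(low_frequency)

print(f"1 mutant per {fewest_alleles:.0f} to {most_alleles:.0f} alleles per round")
```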
Previously, the error rates of Pol δ WT and Pol ε WT during synthesis of seven consecutive thymines were determined using an in vitro lacZ gap-filling assay, similar to the HSV-tk assay . Using those data, we calculated the error rate per repeat unit for Pol δ and Pol ε at the mononucleotide T allele versus the GT dinucleotide allele (Table 6). This analysis suggests that the fidelity of Pol δ and Pol ε for mononucleotide microsatellite DNA synthesis may be even lower, as the polymerases created errors more often within the [T]7 allele than within the [GT]10 allele. However, the effects of repeat unit size (mono-, di- tetra) on replicative polymerase error rates must be tested directly in future studies using the same mutational assay.
We also examined the importance of the intrinsic 3′ to 5′ exonuclease activity to microsatellite sequence variation. The frequency of unit-based microsatellite indel errors produced by exonuclease-deficient enzymes differed by less than 1.4-fold from that measured for the corresponding exonuclease-proficient enzymes, with the exception of Pol δ and the CA allele (Figure 3). Interestingly, the exonucleases tend to preferentially remove interruption errors within the microsatellite alleles (Table 3). Such interruptions, if maintained over several rounds of DNA replication, would be expected to stabilize genomic microsatellites by breaking an allele into two, shorter tandem arrays that mutate at lower frequencies than the parent allele. Thus, the normally protective proofreading function may act instead to promote genome instability within microsatellite sequences. These in vitro results are consistent with previous in vivo studies showing that the proofreading activities of both Pol δ and Pol ε are inefficient at recognizing and repairing indel mutations in [GT/CA] repeats . The low proofreading efficiency within microsatellites may be due to stabilization of the misaligned intermediate over the entire length of the repeated array, resulting in minimal distortion of the DNA substrate. Structurally, the mere physical distance over which to incorporate a bulge of unpaired bases within a long repetitive sequence may result in the physical separation of the polymerase active site from the misalignment, such that the bulge is rendered invisible to the proofreading function.
Evolutionary models of microsatellite mutation have been developed for use in estimates of genetic distances between populations (reviewed in ). The widely used stepwise mutation models for microsatellite mutation assume that insertions or deletions of a single unit occur at fixed rates as a function of allele length. A full 20% of the indel mutations created by the wild-type replicative polymerases within the dinucleotide alleles were deletions of two or more repeat units, as measured in the HSV-tk assay (Table 3). These observations suggest that Pol δ and Pol ε may be able to accommodate large loops of extrahelical nucleotides during extension synthesis. Alternatively, the multi-unit deletions may result from the simultaneous formation of multiple, single unit bulges within the long [GT/CA]10 repeated tract. Further investigation is required to elucidate the mechanistic underpinnings of the multi-unit repeat deletions. Regardless of mechanism, the incidence of multi-unit deletions within the [GT/CA] microsatellite should be taken into account in future mathematical models of microsatellite evolution.
Finally, small strand biases in replicative polymerase fidelity during [GT]10 versus [CA]10 microsatellite synthesis were observed (Figure 2). This bias is intriguing, as a current model for genome replication is that Pol δ synthesizes the lagging strand template  and Pol ε synthesizes the leading strand template . The in vitro data presented here predict a strand bias in microsatellite mutability may exist in vivo, depending on the position of the GT versus the CA dinucleotide sequence relative to the origin of DNA replication (Table 5). We plan to further investigate a microsatellite strand bias for DNA polymerase errors in vivo, using a yeast reporter cassette adjacent to a well-defined origin of replication in a yeast strain with specialized Pol δ and Pol ε mutator alleles with wild type catalytic activity and strong mutational specificity [40,41]. In yeast, differences in the specificity and efficiency of MMR correction have been observed among A10, T10, C10, and G10 alleles [47,48]. The results of this study support the hypothesis that these biases reflect, in part, the error specificities of the replicative polymerases that initiate the mutation.
The high replicative DNA polymerase fidelity associated with synthesis of gene target sequences is not maintained during microsatellite DNA synthesis. Both Pol δ and Pol ε holoenzymes produce a high frequency of indel errors within the [GT/CA] microsatellite sequence. While the majority of indel errors are deletions of one repeat unit, a significant proportion (~20%) are deletions of multiple units. The proofreading exonuclease activities of polymerases δ and ε contribute little to the repair of unit-based indel errors within the microsatellite. The B family DNA polymerases δ and ε are widely believed to be responsible for the bulk of DNA replication elongation synthesis within eukaryotic genomes, but also have been implicated in the DNA synthesis associated with several DNA repair pathways and recombination . The high holoenzyme error rates within microsatellites, coupled with the low efficiency of proofreading correction of polymerase indel errors, are expected to place a high burden of premutational intermediates upon the mismatch repair system in vivo.
The DNA sequencing Research Support group at the NIEHS completed the sequencing reactions. We thank Ms. Noelle Strubczweski for determining the unfilled gapped substrate background mutant frequency. We also thank E. Johansson and E.B. Lundström for providing purified exonuclease-deficient Pol ε enzyme for in vitro reactions. This work was supported by grant R01 CA100060 from the National Institutes of Health to K.A.E. and by project Z01 ES065070 to T.A.K. from the Department of Intramural Research of the National Institutes of Health. The funding sources had no role of any type in the study.
¹ S.E. Hile, M.Y.W.T. Lee, and K.A. Eckert, manuscript in preparation.
Conflict of Interest statement: The authors declare that there are no conflicts of interest.