The abundance of ribonucleotides in DNA remained undetected until recently because they are efficiently removed by the Ribonucleotide Excision Repair pathway, a process that, after incision by RNase H2, resembles Okazaki fragment processing. All DNA polymerases incorporate ribonucleotides during DNA synthesis. How many are incorporated, when, and why have been the focus of intense work in many labs during recent years. In this review, we discuss recent advances in understanding ribonucleotide incorporation by eukaryotic DNA polymerases that suggest an evolutionarily conserved role for ribonucleotides in DNA, and we review the data indicating that removal of ribonucleotides plays an important role in maintaining genome stability.
Eukaryotic DNA replication proceeds at an enormous rate with remarkable accuracy, employing three major DNA polymerases. During embryonic development, Drosophila melanogaster duplicates its genome in 8 minutes at 25 °C, due in large part to the use of multiple origins of replication. Given the size of its genome, the rate of incorporation is about 0.5 Mbp/second. As this rapid replication proceeds, there are many opportunities for various types of errors. To maintain genome integrity, replicative polymerases are highly accurate in copying DNA. For DNA Pol α, the rate of incorporation of the incorrect base is about 10⁻⁴. DNA Pol δ and ε can proofread the last incorporated deoxynucleoside monophosphate (dNMP) and, if it is incorrect, remove it using the polymerases’ 3′-exonuclease activity. Proofreading improves their fidelity to a frequency of incorrect insertion of about 10⁻⁶. Once the DNA exits the polymerase, the mismatch repair (MMR) system scans the newly synthesized strand for proper base pairing with the template and repairs any mismatch, lowering the error rate to between 10⁻⁷ and 10⁻¹⁰.
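As a rough illustration of what these per-base error rates mean, the following minimal Python sketch converts them into expected numbers of misincorporated bases per round of replication. The rates are the ones quoted above; the genome size (1.2 × 10⁷ bp, roughly that of a haploid budding yeast) and the function names are assumptions introduced only for illustration.

```python
# Per-base error rates quoted in the text.
ERROR_RATES = {
    "base selectivity only (Pol alpha)": 1e-4,
    "with 3'-exonuclease proofreading (Pol delta/epsilon)": 1e-6,
    "after mismatch repair (upper bound)": 1e-7,
    "after mismatch repair (lower bound)": 1e-10,
}

# Assumption for illustration only: genome size in base pairs (not from the text).
GENOME_BP = 1.2e7


def expected_errors(rate_per_base: float, genome_bp: float) -> float:
    """Expected number of misincorporated bases in one round of replication."""
    return rate_per_base * genome_bp


def fold_improvement(rate_before: float, rate_after: float) -> float:
    """How many times lower the error rate is after an additional fidelity step."""
    return rate_before / rate_after


if __name__ == "__main__":
    for label, rate in ERROR_RATES.items():
        errors = expected_errors(rate, GENOME_BP)
        print(f"{label}: about {errors:.3g} errors per replication")
    gain = fold_improvement(1e-4, 1e-6)
    print(f"proofreading improves fidelity about {gain:.0f}-fold")
```

Under these assumptions, selectivity alone would leave on the order of a thousand errors per replication, whereas the combined action of proofreading and MMR brings the expectation to about one error or fewer.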
Replication of damaged DNA requires specialized repair polymerases. Replicative polymerases have a very compact binding site that can accommodate only incoming bases that form proper hydrogen bonds with the template. This tight binding pocket cannot fit template DNA containing damaged or modified bases[4, 5]. To prevent DNA synthesis from stalling at damage, replicative polymerases are replaced by translesion polymerases, which have a larger active site and therefore lower base selectivity. Translesion polymerases vastly outnumber replicative polymerases, as each is specialized for synthesis across different types of DNA damage. Together they allow completion of DNA replication and repair to maintain genome integrity, but at the cost of generating mutations.
Until recently, examination of fidelity of DNA replication has mostly concentrated on how the DNA polymerases can insert the correct base, with limited concern for sugar discrimination. The abundant presence of ribonucleotides in chromosomal DNA was not detected until Kunkel’s lab, in an attempt to make mutants of the replicative polymerases to define their roles in replication, found a DNA polymerase ε (Pol ε) mutant that incorporates large numbers of ribonucleotides, and discovered that even the wild-type polymerases incorporate many ribonucleoside monophosphates (rNMPs) into DNA in vivo. These mutants led to the currently accepted division of labor model of chromosomal DNA replication, which places the DNA Pol ε as the major leading strand replicase and Pol α and Pol δ as the enzymes responsible for lagging strand Okazaki Fragment (OF) production.
The unexpected finding that replicative DNA polymerases in Saccharomyces cerevisiae incorporate large numbers of ribonucleotides during DNA synthesis[8, 9] has opened a whole new field of research that is improving our understanding of multiple processes and will likely yield even more surprises. Here we describe recent studies on incorporation of rNMPs by the major DNA replicases. These studies provide details on how the polymerases select against rNTPs, the frequencies with which each replicative polymerase incorporates rNMPs, how rNMPs are removed by ribonuclease H2 (RNase H2; see Glossary), some of the benefits of having rNMPs transiently in DNA, and the problems that arise when rNMPs remain in double-stranded DNA because RNase H2 is defective.
DNA polymerases have a narrow active site suited to dNTPs that does not easily accommodate the larger substituent at the 2′ position of rNTPs (OH vs. H). Most DNA polymerases use a bulky amino acid residue called the “steric gate”, located at the entrance of the active site, to prevent rNMP incorporation[5, 10]. The steric gate residues are Glu for A family polymerases (see Glossary) and Tyr or Phe for members of the B, X, Y and RT families. Reducing the size of the side chain of the steric gate residue increases the tolerance for ribonucleotides in the active site, in some cases to the extent that they are incorporated as efficiently as deoxyribonucleotides. However, not all DNA polymerases use a bulky side chain to prevent rNMP incorporation: crystallographic studies showed that the X family members pol β and pol λ use a segment of the peptide backbone to exclude rNTPs[12-14].
The steric gate amino acid also plays a role in base discrimination and fidelity in copying the template. A smaller side chain not only improves incorporation of rNMPs but also decreases affinity for the correct dNTP. Because of its crucial role in base and sugar selectivity, replacing the steric gate tyrosine (Y645) of yeast Pol ε with alanine results in lethality. However, mutating the adjacent hydrophobic amino acid M644 has a strong effect on fidelity without impairing growth. Sugar selectivity of Pol ε can vary over a thirty-fold range when M644 is replaced by Leu or Gly: the Leu substitution increases selectivity, whereas the Gly substitution decreases sugar discrimination. Therefore, Pol ε is one amino acid away from being either better or worse at ribonucleotide exclusion. Although Leu is present at the corresponding site in both Pol α and Pol δ, Met644 is highly conserved in Pol ε from different organisms, suggesting a functional value of rNMP incorporation in the leading strand[6, 8]. Pol ε-M644G has a high mutation rate, and this observation was exploited in the elegant experiments that led to the conclusion that Pol ε is responsible for leading strand replication while Pol δ synthesizes the lagging strand[7, 17].
Steric gate amino acids in the Y family of translesion DNA polymerases also control sugar selectivity and occasionally have unexpected effects on base fidelity. Mutating the steric gate F35 residue of S. cerevisiae Pol η increases base fidelity at the cost of decreasing sugar selectivity. The same is true for Y39 of Pol ι. For other translesion polymerases, the steric gate residue is important for lesion bypass. Some DNA polymerases, such as Terminal Deoxynucleotidyl Transferase (TdT)[20, 21] and DNA polymerase μ, are extraordinarily able to use rNTPs, with little preference for dNTPs. When cellular concentrations of rNTPs and dNTPs (see following section) are used in vitro, as much as 80-90% of the product is composed of rNMPs[20, 21]. This may be a mechanism to accomplish DNA repair in phases of the cell cycle when the concentration of dNTPs is very low, a way to mark DNA for further repair, or a mechanism of programmed mutagenesis in antibody diversification[20, 21].
The 3′-exonuclease proofreading domain of replicative DNA polymerases effectively eliminates mispaired bases but is inefficient at removing ribonucleotides[22, 23]. Once a ribonucleotide has been added to the growing chain, incorporation of the next nucleotide proceeds more slowly than extension from a dNMP, possibly favoring removal by the 3′-exonuclease activity over elongation[24, 25]. The abilities of several eukaryotic polymerases to incorporate, extend and bypass ribonucleotides in DNA are summarized in Table 1[2, 6, 8, 12, 14, 19, 20, 22, 26-33].
The steric gate is severely challenged in vivo by the high concentrations of rNTPs[6, 20]. The ratio of cellular dNTPs to rNTPs varies from one organism to another, from tissue to tissue, and with the stage of the cell cycle, but in all cases rNTPs are far more abundant than dNTPs (Figure 1). Throughout the cell cycle the concentration of rNTPs is relatively constant, whereas the production of dNTPs increases during S-phase, suggesting that fewer ribonucleotides are incorporated during DNA replication and more during repair synthesis outside S-phase. In terminally differentiated cells, such as human macrophages[35, 36], the dNTP concentrations are very low (Figure 1). To adjust to the large pool of rNTPs, specialized DNA polymerases repairing DNA in these cells are particularly adept at incorporating ribonucleotides. Low levels of dNTPs are maintained in macrophages by SAMHD1, a dNTP triphosphohydrolase, which is degraded upon infection with human immunodeficiency virus type 2 (HIV-2), resulting in increased dNTP concentrations that favor viral replication.
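The interplay between nucleotide pools and sugar selectivity can be made concrete with a simple competition model, sketched here in Python. It assumes that the fraction of insertion events using an rNTP depends only on the rNTP and dNTP concentrations and on a single selectivity factor (the relative catalytic efficiency for a dNTP over an rNTP). The function name and all numerical values are hypothetical and are not taken from the studies cited above.

```python
def rnmp_fraction(rntp: float, dntp: float, selectivity: float) -> float:
    """Fraction of insertions that use an rNTP under simple substrate competition.

    rntp, dntp  -- concentrations in the same (arbitrary) units
    selectivity -- assumed efficiency ratio, (kcat/Km for dNTP) / (kcat/Km for rNTP)
    """
    if rntp < 0 or dntp < 0 or selectivity <= 0:
        raise ValueError("concentrations must be >= 0 and selectivity must be > 0")
    weighted_dntp = selectivity * dntp
    total = rntp + weighted_dntp
    if total == 0:
        raise ValueError("at least one nucleotide pool must be non-zero")
    return rntp / total


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the direction of the effect.
    rntp_pool = 1000.0
    gate_selectivity = 10000.0
    for label, dntp_pool in [("low dNTP pool", 1.0), ("high dNTP pool", 20.0)]:
        frac = rnmp_fraction(rntp_pool, dntp_pool, gate_selectivity)
        print(f"{label}: rNMP fraction = {frac:.4f}")
```

In this model, raising the dNTP pool at a constant rNTP pool lowers the expected rNMP fraction, which is the qualitative trend described above for S-phase versus non-dividing cells.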
In 1993 Eder and Walder proposed that RNase H2 is involved in removing ribonucleotides from DNA when they reported RNase H2 incision at a single rNMP in a synthetic substrate in vitro; however, the significance of this finding was not appreciated until the abundance of rNMPs in DNA was discovered. A role for RNase H2 in removing rNMPs from DNA in S. cerevisiae was subsequently indicated by Rydberg and Game. Ribonucleotides are efficiently removed by the “Ribonucleotide Excision Repair” (RER) mechanism, which is initiated by RNase H2 through cleavage at the 5′-end of the ribonucleotide, followed by strand displacement synthesis by Pol δ, flap removal by Fen1 and/or Dna2, and ligation by DNA ligase 1 (Figure 2). Except for the first critical step carried out by RNase H2, the process is identical to OF maturation (Figure 2). RNases H were long thought to be responsible for removing OF primers. While this may occur in bacterial and mitochondrial DNA replication, in eukaryotic genome replication Fen1 and Dna2 are the major activities removing the flap structure containing the primer (Figure 2).
RER could occur concomitantly with or following DNA replication, allowing rNMPs to serve as marks of newly synthesized DNA strands. rNMPs are more abundant in the leading strand, where topological issues arise. RNase H2 incision at an rNMP could aid in releasing torsional constraints in DNA as well as creating sites to correct errors in replication. MMR provides an additional means of ensuring insertion of the correct base, but it requires a nick to initiate repair, a function that could be supplied by RNase H2 acting on rNMPs in leading strand DNA[43, 44] (Figure 3, Key Figure). Short lagging strand OFs contain many entry points for MMR (Figure 3, Key Figure). Replication and repair DNA synthesis require PCNA to bring together the different processing factors and facilitate the reaction. RNase H2, like other components of the RER process, contains a PCNA-interacting peptide (PIP). Eukaryotic and a few archaeal RNases H2 have a PIP box, suggesting they are recruited to replication and repair DNA complexes. In vitro, RER requires PCNA but not the PIP box on RNase H2. The activity of S. cerevisiae and human RNase H2[45, 46] is not stimulated by PCNA, while that of Archaeoglobus fulgidus RNase HII, a single chain enzyme, is increased in vitro by PCNA. This difference may reflect the location of the PIP box: in the trimeric eukaryotic enzyme it lies on the accessory RNase H2B subunit, which in the structure is quite remote from the catalytic center[47-49], whereas in the single chain A. fulgidus enzyme it is much closer to the active center. It seems reasonable that RNase H2 initiates the RER process and that its PIP interacts with PCNA to attract other RER components for the subsequent steps of rNMP removal. Similarly, cleavage at the ribonucleotide by RNase H2 could attract PCNA and initiate the loading of the MMR system to scan for mismatches in the newly synthesized leading strand (Figure 3, Key Figure).
Although RER functions in association with DNA replication, in S. cerevisiae RNase H2 expression fluctuates with the cell cycle, peaking during the S and G2/M phases. In HeLa cells, western blot analysis indicates that all three RNase H2 subunits are present throughout the cell cycle. In addition to initiating RER in S-phase, ribonucleotide processing is likely also to occur during DNA repair when cells are in G2/M, perhaps to eliminate rNMPs that escaped cleavage after replication or to remove ribonucleotides incorporated by repair DNA polymerases.
Yeast strains deleted for RNase H2 are viable but contain numerous rNMPs in genomic DNA[52-55]. Some of these rNMPs are removed by alternative pathways that may induce genome instability, including mild replicative stress and checkpoint activation, and that are associated with higher mutation rates due to 2-5 base-pair deletions in short repetitive sequences. These phenotypes are linked to topoisomerase 1 (Top1) acting on rNMPs. Top1 resolves DNA supercoiling created during transcription or replication by making a nick in one DNA strand, which is religated after the stress is relieved[56-59]. RER could also prevent supercoiling by incising at the rNMP, and perhaps that is one of the major roles of RNase H2. Absence of RNase H2 would create a need for Top1 to eliminate supercoiling incurred on leading strand DNA[60, 61]. Since lagging strand OFs are short, replication-induced supercoiling would not be an issue and may not attract Top1 (Figure 3, Key Figure). rNMP removal from lagging strand DNA could be performed by other enzymes, leading to different patterns of mutations such as the single base pair deletions seen with the steric gate Pol η mutation, although single base pair deletions have also been observed in the leading strand in rnh201Δ pol2M644G strains.
When Top1 cleaves at a ribonucleotide, the 2′-OH group and 3′-phosphate can form a 2′-3′ cyclic phosphate that cannot be religated to the 5′-OH, leaving a nick in the DNA (Figure 2 and Figure 3, Key Figure). Repair of the break can occur in either an error-free or a mutagenic way. A second Top1 cleavage a few nucleotides upstream of the first releases a short oligonucleotide containing the 2′-3′ cyclic phosphate, resulting in a short gap[57, 63]. The gap can be filled in by the action of Srs2 helicase, Exo1 nuclease and Pol δ[40, 57, 64] to correctly restore the DNA. Gap filling by a translesion polymerase would result in error-prone DNA synthesis. Moreover, if the gap is within a tandem repeat, the sequence can re-align, bringing the two ends together and allowing Top1-mediated ligation, which creates the 2-5 bp deletions observed in association with Top1 ribonucleotide processing (Figure 2 and Figure 3, Key Figure). It is the nicks introduced in DNA by Top1 and their faulty processing that induce genomic instability, not the presence of ribonucleotides per se, because deleting TOP1 in the rnh201Δ strain eliminates these phenotypes while leaving unprocessed ribonucleotides in DNA.
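The sequence outcome of the mutagenic route can be illustrated with a toy Python sketch: when the gap falls within a tandem repeat, re-alignment and ligation remove exactly one repeat unit of 2-5 bp. The function name, the example sequence and the coordinates are hypothetical; the code models only the resulting deletion, not the enzymology.

```python
def delete_repeat_unit(seq: str, start: int, unit_len: int) -> str:
    """Return `seq` with one copy of a tandem repeat unit removed.

    The unit begins at index `start` and must be followed immediately by an
    identical copy, mimicking re-alignment of the two ends within a repeat.
    """
    if not 2 <= unit_len <= 5:
        raise ValueError("unit_len must be between 2 and 5 bp")
    if start < 0:
        raise ValueError("start must be non-negative")
    unit = seq[start:start + unit_len]
    next_unit = seq[start + unit_len:start + 2 * unit_len]
    if len(unit) < unit_len or unit != next_unit:
        raise ValueError("position is not inside a tandem repeat of this unit length")
    # Dropping one unit leaves the flanking sequence and remaining repeats intact.
    return seq[:start] + seq[start + unit_len:]


if __name__ == "__main__":
    before = "GGTACACACATTC"  # hypothetical sequence containing an (AC)3 tract
    after = delete_repeat_unit(before, start=3, unit_len=2)
    print(before, "->", after, f"({len(before) - len(after)} bp deleted)")
```

Because the deleted unit is identical to its neighbor, the same product arises wherever within the repeat tract the gap is placed, which is consistent with deletions being confined to short repetitive sequences.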
Yeast rnh201Δ pol2M644G mutant strains accumulate large numbers of rNMPs, and the combination is lethal when RNH1 is also deleted[33, 61]. However, because RNase H1 (see Glossary) cannot process single ribonucleotides in DNA, it cannot substitute for RNase H2 in the RER pathway. Consequently, the detrimental effect of combining the pol2M644G, rnh201Δ and rnh1Δ mutations in the same strain may be due to abundant rNMPs in DNA together with accumulation of R-loops (see Glossary). R-loops can stall replication fork progression, generating DNA damage (Figure 4). Transcription induces supercoiling in front of and behind RNA polymerases, providing substrates for Top1 and promoting R-loops. S. cerevisiae rnh1Δ rnh201Δ strains are viable but require Top1 activity to prevent R-loop formation during transcription, most notably in rRNA genes[67, 68]. Highly transcribed genes in an RER-defective strain with wild-type RNH1 also exhibit 2-5 bp deletions on the non-transcribed strand (NTS), regardless of the direction of replication. In this instance, even overexpression of RNase H1 fails to eliminate the deletions. The high frequency of transcription-induced supercoiling may give Top1 access to rNMPs in the NTS. These findings all point to the close association between RNases H and DNA topology.
Recently, the ability to map ribonucleotides in DNA has yielded valuable information about the complexities of the eukaryotic DNA replication process. Four different groups using distinct but related techniques have located the positions of rNMPs in budding[52-54] and fission yeast. All used deep sequencing and various procedures for marking the positions of the rNMPs, which were incorporated by the same DNA Pol mutants used to determine the division of labor at the replication fork. These studies mapped replication origins (some known and some new), where initiation occurs in a very narrow region, while converging replication forks result in broad termination zones. They found a non-random distribution of rNMPs, with many base pairs having no rNMPs (no reads) while, on the same scale, some sites had more than 1000 reads, suggesting sequence context effects. More rNMPs were present in the leading strand, confirming that wild-type Pol ε and its steric gate M644G mutant incorporate rNMPs at higher rates than their Pol δ counterparts. More surprisingly, they established that Pol α makes a substantial contribution to genomic DNA after OF maturation. It had been thought that the majority of DNA synthesized by Pol α is removed by displacement synthesis by Pol δ and Fen1 cleavage of the flap. Persistent residual rNMPs could reflect the failure to remove Pol α-synthesized DNA in some of the large numbers of OFs (~10⁵ OFs formed during replication of haploid S. cerevisiae and ~10 million OFs synthesized in diploid human DNA replication).
The replication complex plows through nucleosomes and proteins bound to DNA, such as transcription factors, ahead of the replication fork to open the DNA strands and allow synthesis. Nucleosomes form rapidly after leading strand synthesis, but the discontinuous nature of the lagging strand presents obstacles for nucleosome assembly. It is unknown whether nucleosome assembly and ligation of the newly synthesized lagging strand DNA are coupled. OFs are similar in length to the DNA wrapped around a nucleosome plus the internucleosome spacer. It has been reported that in the absence of DNA Lig1 the ends of two adjacent OFs are located at the dyad of the nucleosome. Removal of the Primase-Pol α RNA-DNA (PriαR-D) product depends on Fen1 cleavage following displacement by Pol δ (Figure 2, black rectangle). Were there a delay in Pol δ synthesis or in Fen1 removal of the PriαR-D region, nucleosome assembly could occur on DNA containing the PriαR-D sequence (Figure 4, blue oval), impairing complete removal of the RNA primer and leaving multiple rNMPs, which would have been scored as an rNMP when mapping ribonucleotides in DNA. Interestingly, two papers described slight increases of rNMPs at the nucleosome dyad[52, 53].
Mating type switching in two distantly related Schizosaccharomyces species (S. pombe and S. japonicus) is characterized by the presence of a di-ribonucleotide in DNA at specific sites of the mating type locus. Di-ribonucleotide imprinting requires Pol α, suggesting that the ribonucleotides are derived from the PriαR-D of OFs. Imprinting for mating-type switching is eliminated when a region just prior to (or after) the site of imprinting is deleted. The length of the region is important, not the sequence, indicating that the DNA may function as a spacer to position the site to be imprinted. Replication pausing is another important factor for imprinting. As suggested above, pausing could prevent the displacement and maturation of the OF with rNMPs at its 5′-end, and may allow nucleosome formation on a region of DNA that contains a portion of the RNA primer. It is unclear how the imprint arises and how it escapes recognition by RNase H2.
Since the 1970s, it has been known that mouse and human mitochondrial DNA (mtDNA) contain ribonucleotides that are sensitive to alkali treatment and RNase H2. These rNMPs are incorporated by mitochondrial DNA Pol γ during DNA replication, although, like other polymerases, DNA Pol γ has a functional steric gate (Table 1). The ribonucleotides that survive in mtDNA are resistant to the RNase H1 present in mitochondria, indicating that they are most likely present as mono-, di- or tri-ribonucleotides[51, 76]. RER is not active in mitochondria because RNase H2 is absent from this organelle. Mitochondrial functions are not affected by the presence of these dispersed rNMPs. However, in the absence of RNase H1, failure to remove the RNA primer used to initiate mtDNA replication has catastrophic effects. It will be of great interest to determine the frequency and location of rNMPs in mtDNA and to assess any role they might have in mitochondria-related pathologies. S. cerevisiae mtDNA is present in strains deleted for RNH1[50, 77] and RNH201[51, 52, 54, 77]. Yeast mtDNA is larger and more complex than metazoan mtDNA. Its replication mechanism is likely quite distinct, as indicated by maintenance of mtDNA in the absence of RNases H1 and H2. Clausen et al. interpreted their mtDNA sequences as compatible with a circular mtDNA genome, but the data are also consistent with recombination-driven replication.
Unlike in yeasts, where the presence of ribonucleotides in DNA is not detrimental per se, it has been suggested that the embryonic lethality at E9.5 of mice lacking RNase H2 is due to abundant rNMPs in DNA[51, 79] (estimated to be 1.3 million, or approximately one rNMP per 7,500 nucleotides). This conclusion is important for understanding patients with Aicardi-Goutières syndrome (AGS) and systemic lupus erythematosus (SLE) who have mutations in RNase H2. AGS is a neuroinflammatory autoimmune disease resulting from mutations in seven genes encoding nucleotide-processing enzymes, with more than 50% of patients having biallelic mutations in one of the three subunits of RNase H2. A recent study reporting a higher frequency of rare variants in RNase H2 genes in a cohort of SLE patients than in normal controls established a genetic association between RNase H2 and SLE. The hallmark of defects in RNase H2, rNMPs in DNA, was suggested to be responsible for the observed activation of autoimmunity. While rNMPs in DNA may be the culprit, RNA/DNA hybrid accumulation might contribute to the observed phenotypes, since RNase H2 represents greater than 90% of the hybrid degradative activity in eukaryotic cells. Somewhat surprisingly, RNA/DNA hybrids were found at higher than normal levels in cell lines from AGS patients with mutations in RNASEH2A, RNASEH2B, SAMHD1 and TREX1, suggesting hybrids as possible causes of AGS. The debate between the consequences of rNMPs in DNA and the failure to resolve RNA/DNA hybrids continues. Two recent studies in S. cerevisiae examining recombination in diploid strains reported genome instability in the absence of RNase H2 but reached opposite conclusions as to the contributions of the two RNase H2 activities to the observed instability[59, 84]. Clearly more work lies ahead, which will be aided by an RNase H2-RED (Ribonucleotide Excision Defective) mutant that lacks RER activity but can hydrolyze RNA/DNA hybrids.
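The two figures quoted for RNase H2-deficient mouse cells (about 1.3 million rNMPs, at about one per 7,500 nucleotides) can be cross-checked with simple arithmetic. The short Python sketch below multiplies them to obtain the total number of nucleotides they imply; the constant and function names are illustrative assumptions, and no independent genome size is used.

```python
# Estimates quoted in the text for mouse cells lacking RNase H2.
RNMP_COUNT = 1.3e6          # rNMPs per cell
NT_PER_RNMP = 7500          # nucleotides per rNMP


def implied_nucleotides(rnmp_count: float, nt_per_rnmp: float) -> float:
    """Total nucleotides implied by a count of rNMPs and their average spacing."""
    return rnmp_count * nt_per_rnmp


def rnmp_frequency(nt_per_rnmp: float) -> float:
    """rNMPs per nucleotide, the reciprocal of the average spacing."""
    return 1.0 / nt_per_rnmp


if __name__ == "__main__":
    total_nt = implied_nucleotides(RNMP_COUNT, NT_PER_RNMP)
    print(f"implied total: {total_nt:.3g} nucleotides")
    print(f"frequency: {rnmp_frequency(NT_PER_RNMP):.2e} rNMPs per nucleotide")
```

The product is about 9.75 × 10⁹ nucleotides, so the two estimates describe the same measurement expressed as a count and as a density.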
All DNA polymerases incorporate ribonucleotides because of imperfect exclusion of rNTPs at the catalytic center and the high ratio of rNTPs to dNTPs in all cells. DNA Pol ε incorporates more rNMPs as it synthesizes the leading strand than do Pol δ and Pol α during lagging strand formation. Evidence suggests that ribonucleotides in the leading strand are important for subsequent steps in DNA metabolism. Their processing, initiated by RNase H2 nicking at the rNMPs, may release torsional stress and provide an entry point for the MMR system, contributing to DNA integrity. Over one million rNMPs are present in the DNA of mouse embryonic cells lacking RNase H2, resulting in death at E9.5. Patients with the autoimmune syndromes AGS and SLE have mutations that decrease RNase H2 activity and accumulate rNMPs. RNase H2 provides the initial step in removal of rNMPs but also processes RNA/DNA hybrids, products of transcription and reverse transcription. The contribution of each activity to the phenotypes observed in mouse and human remains to be determined. The RNase H2 RED mutant would be valuable for defining the role of rNMP processing in eukaryotes.
We thank Andrei Chabes (Umeå University, Medical Biochemistry and Biophysics, SE-901 87 Umeå, Sweden) for providing comments on concentrations of dNTPs and rNTPs in S. cerevisiae, Baek Kim (Department of Pediatrics, Emory School of Medicine, Atlanta, GA) for comments on NTP concentrations in mammalian cells, David Clark (NICHD, NIH) for discussion of nucleosomes, and Ariadne Cerritelli for help with figures. This work was supported by the Intramural Program of NIH.