Cells possess many DNA polymerases (DNAPs); e.g., human, yeast (S. cerevisiae) and E. coli have at least fifteen, eight and five, respectively.
1 - 3 The cellular role of some DNAPs can be understood by noting that DNA is constantly being damaged by radiation and chemicals, and most adducts/lesions that are not removed by DNA repair block replicative DNA polymerases. To avoid such lethal blockage, cells possess lesion-bypass DNAPs,
1 - 12 which conduct translesion DNA synthesis (TLS). Many lesion-bypass DNAPs are in the Y-family,
1 - 12 where humans have three (DNAPs η, ι and κ), yeast have one (DNAP η) and
E. coli has two (DNAPs IV and V).
Y-Family DNAPs have a conserved ~350aa core, which includes the polymerase active site [representative references
13 - 22]. As with all DNA polymerases, Y-Family members resemble a right hand with thumb, palm and fingers domains, although their “stubby” fingers and thumb result in more solvent-accessible surface around the template/dNTP-binding pocket,
7 presumably to accommodate the bypass of bulky and/or deforming DNA adducts/lesions, which protrude into these open spaces. Y-Family DNAPs grip DNA with an additional domain,
7, 13, 17, 18 usually called the “little finger”. Steps in the mechanism of Y-Family DNAPs have been proposed for both protein structural changes based on a series of X-ray structures
22-24 and for chemical catalysis based on theoretical studies.
25
Our work has focused on benzo[a]pyrene (B[a]P), a well-studied DNA-damaging agent that is a potent mutagen/carcinogen and an example of a polycyclic aromatic hydrocarbon (PAH), a class of ubiquitous environmental substances produced by incomplete combustion.
26 - 27 PAHs in general and B[a]P in particular induce the kinds of mutations thought to be relevant to carcinogenesis and may be important in human cancer (representative reference,
28). B[a]P mutational spectra were established with the biologically relevant metabolite (+)-
anti-B[a]PDE in
E. coli,
29 yeast
30, 31 and mammalian (CHO) cells [reference
32 and references therein]. Mutagenesis has also been studied with [+ta]-B[a]P-N²-dG (+BP), the major adduct of (+)-
anti-B[a]PDE, and G->T mutations predominate in most cases [reference
33 and references therein].
Based on genetic studies, DNAPs IV and V are both involved in the non-mutagenic translesion synthesis (TLS) pathway with +BP in
E. coli.
34 - 40 Why might two DNAPs be required? Certain lesions need one DNAP for insertion (dCTP in this case) and a second for extension, which (following adduct-G:C formation) involves dNTP insertion opposite the next 5′-template base (and additional bases).
41 - 42 The current understanding of insertion and extension for both +BP and -BP is supported by the following observations. Purified DNAP IV principally inserted dCTP (>99%) opposite +BP and -BP in a 5′-CGA sequence,
34 and evidence in cells suggests DNAP IV does dCTP insertion opposite +BP
35 - 39 and -BP [37], as well as other N²-dG adducts.
38, 39 In contrast, purified DNAP V almost exclusively inserted dATP opposite +BP in a 5′-CGA sequence (>99%),
34 and genetic findings show that DNAP V must be responsible for dATP insertion opposite +BP in the G->T pathway in
E. coli.
40 Collectively, these findings suggest it is DNAP IV, and not DNAP V, that does dCTP insertion opposite +BP, implying that the role of DNAP V is likely to be for extension in the non-mutagenic pathway. In the non-mutagenic pathway in
E. coli with -BP, only DNAP IV is required,
37 implying that DNAP IV does insertion and extension with -BP. The notion that DNAP IV does extension with -BP while DNAP V does extension with +BP is consistent with kinetic findings with purified proteins: DNAP IV is significantly worse than DNAP V at the extension step in the case of +BP compared to -BP.
34 The fact that G->T mutations in a 5′-TGT sequence depend on DNAP V, but not DNAPs II or IV,
40 suggests that DNAP V does both dATP insertion and extension in the mutagenic pathway. In vitro evidence also suggests that DNAP V does dATP insertion opposite -BP,
34 and we have preliminary evidence that DNAP V is involved in the G->T pathway in
E. coli with -BP, though involvement by DNAPs II and IV has not yet been investigated. Though random mutagenesis studies with [+
anti]-B[a]PDE suggest that most G->T mutations with B[a]P-adducts require SOS-induction, implying involvement of a lesion-bypass DNAP, a minor non-SOS-inducible G->T pathway does exist [discussed in reference
37] and was studied in a 5′-GGA sequence, in which dATP insertion opposite +BP by DNAP III was proposed.
35
The study of
E. coli’s Y-Family DNAPs may provide insights about Y-Family DNAPs in general. Human DNAP κ was originally discovered because its sequence closely resembles
E. coli DNAP IV,
43 - 45 and dNTP insertion opposite a variety of adducts/lesions, including +BP, is remarkably similar for the DNAP IV/κ pair, suggesting they are functional orthologs [discussed in reference
46]. This notion was substantiated when the identical mutation in a conserved residue in the active site of DNAP IV and DNAP κ (the “steric gate”, which excludes rNTPs) had a similar effect on lesion bypass vs. normal replication both
in vitro and in cells.
38 E. coli DNAP V and human DNAP η are also functional orthologs, based on their similarity of dNTP insertion opposite a variety of adducts/lesions (
47).
Table 1. Dominant dNTP insertions opposite various DNA adducts/lesions by E. coli DNAPs IV and V and human DNAPs κ and η.
DNAPs IV and κ have been shown to accurately bypass a variety of N²-dG adducts,
34 - 39 including those formed from endogenous trioses,
39 which may be the main cellular rationale for the genesis of the IV/κ-class. A case has been made that the main cellular rationale for the DNAP V/η-class is TLS of UV-induced CPDs (discussed in reference
46).
There must be structural reasons why the insertion preference opposite adducts/lesions is different for the DNAP IV/κ-class vs. the DNAP V/η-class (
47). The key differences, though, are not obvious given that the orthologs UmuC(V) and hDNAP η are only 20% identical by alignment, and not more identical than (e.g.) the non-ortholog pair UmuC(V) and hDNAP κ, which are 21% identical.
47 Understanding the structural basis for mechanistic differences is further complicated by the fact that no X-ray structures exist for UmuC (the polymerase subunit of DNAP V), DNAP IV or hDNAP η, though X-ray structures exist for hDNAP κ with DNA.
21 The need for protein structural information to guide our investigation of protein functional differences induced us to build models of UmuC(V), DNAP IV and hDNAP η taking a homology modeling approach.
46 - 48 Though analysis of X-ray structures, modeled structures and sequence alignment suggests that Y-Family DNAPs lend themselves to accurate homology modeling,
46 - 48 we wished to evaluate whether our models are likely to be correct, especially in the vicinity of the active site. Herein, we have investigated aspects of the structure of UmuC(V), which we believe is unlikely to have an X-ray structure in the near term given its complexity and how difficult it is to purify (e.g., see reference
49). We have taken a classic structure-activity approach: mutating the protein and inferring aspects of structure from changes in activity.
One residue of interest in Y-Family DNAPs is the “roof-amino acid”, which is a positionally conserved residue that lies above the nucleobase of the dNTP, as seen in the active site of Dpo4
13 - 16, yeast DNAP η,
18, 19 human DNAP ι
20 and hDNAP κ
21. The roof-aa might influence dNTP insertion. Sequences in the vicinity of the roof-aa were aligned for all Y-Family DNAPs for which X-ray or modeled structures exist,
13 - 21, 46 - 48 along with sequences of DNAPs κ and η from several other species. Based on this alignment, I38 is the roof-aa of UmuC. In our UmuC model, I38 is positioned similarly to the roof-aa in other Y-Family DNAPs for which X-ray structures exist, and in a top view of the model I38 is stacked on top of the nucleobase of the dNTP. Isoleucine is also the roof-aa for the functional ortholog hDNAP η, as well as for DNAP η from other species.
48 Serine is the roof-aa in our DNAP IV model,
47 and serine is the roof-aa for its functional ortholog hDNAP κ, as well as for DNAP κ from most other species.
48 (The roof-aa can also be cysteine or alanine in DNAP κ.) This correlation reinforces the notion that the roof-aa probably plays an important conserved role in protein function, and a role in dNTP insertion is a sensible possibility. Experiments described herein show that I38 mutations affect DNAP V bypass efficiency in a pattern that can be rationalized if I38 is truly the roof-aa, and that the β-branched structure of isoleucine is important for optimal DNAP V activity.
Based on the sequence alignment and our UmuC(V) model,
46-48 the roof-aa is in a loop that extends from aa29 to aa39. The top half of this loop faces into the major groove, which is extensively solvent exposed, while the bottom half faces into the minor groove, where Y-Family DNAPs have a hole/cleft in the protein surface. We call this hole/cleft the “chimney,”
46-48 while others call it the “gap.”
17 The size of this hole/cleft/gap/chimney is small both in our model of UmuC(V)
48 and in the X-ray structure of its functional ortholog scDNAP η,
18, 19 while the hole/cleft/gap/chimney is large in our model of DNAP IV
48 and in the X-ray structure of its ortholog hDNAP κ.
21 (Figures 2, 5A, 5B, 8B and 8D in reference
48 show these differences in opening size.) We recently discussed a likely mechanism by which the identity of several amino acids in this loop controls the size of the opening of the hole/cleft/gap/chimney, and how the large opening for the IV/κ-class might favor dCTP insertion opposite +BP, while the small opening for the V/η-class might lead to dATP insertion.
48 To assess whether we have correctly aligned the aa29-39 loop in our UmuC(V) model, we investigated the following. In X-ray structures of other Y-Family DNAPs, this loop is anchored by a cluster of three amino acids, which correspond to V29/I38/A39 in our UmuC(V) model. Experiments herein support the notion that V29/I38/A39 are indeed clustered, which suggests that we have correctly aligned the amino acids in this loop and bolsters the likelihood that UmuC(V) has a small opening for its hole/cleft/gap/chimney.