Kaposi's sarcoma-associated herpesvirus is an emerging pathogen whose mechanism of replication is poorly understood. PF-8, the presumed processivity factor of Kaposi's sarcoma-associated herpesvirus DNA polymerase, acts in combination with the catalytic subunit, Pol-8, to synthesize viral DNA. We have solved the crystal structure of residues 1 to 304 of PF-8 at a resolution of 2.8 Å. This structure reveals that each monomer of PF-8 shares a fold common to processivity factors. Like human cytomegalovirus UL44, PF-8 forms a head-to-head dimer in the form of a C clamp, with its concave face containing a number of basic residues that are predicted to be important for DNA binding. However, there are several differences with related proteins, especially in loops that extend from each monomer into the center of the C clamp and in the loops that connect the two subdomains of each protein, which may be important for determining PF-8's mode of binding to DNA and to Pol-8. Using the crystal structures of PF-8, the herpes simplex virus catalytic subunit, and RB69 bacteriophage DNA polymerase in complex with DNA and initial experiments testing the effects of inhibition of PF-8-stimulated DNA synthesis by peptides derived from Pol-8, we suggest a model for how PF-8 might form a ternary complex with Pol-8 and DNA. The structure and the model suggest interesting similarities and differences in how PF-8 functions relative to structurally similar proteins.
Most if not all organisms with DNA genomes have mechanisms to ensure processive DNA synthesis. In bacteria, archaea, and eukaryotes, DNA polymerase subunits include a catalytic subunit and a processivity factor, often referred to as a “sliding clamp.” The bacterial sliding (beta) clamp is made up of homodimers of a subunit that comprises three structurally similar subdomains (26), whereas archaeal and eukaryotic proliferating cell nuclear antigen (PCNA) is composed of homotrimers that comprise two structurally similar subdomains (27, 37). For both of these clamps, the monomers assemble head-to-tail to form a closed homodimeric or homotrimeric ring, respectively, around the DNA. In these organisms, a clamp loader protein is required to efficiently load the clamp onto DNA, using an ATP-dependent process (27, 37). Once loaded on DNA, the processivity factor is capable of binding directly to the DNA polymerase, conferring extended strand synthesis without falling off the template (50).
Herpesviruses encode their own DNA polymerases. However, unlike bacteria, archaea, and eukaryotes, herpesviruses do not encode clamp loaders to assemble their processivity factors onto the DNA. Yet, the accessory subunits of the herpesvirus DNA polymerases still associate with DNA with nanomolar affinity to enable long-chain DNA synthesis (9, 16, 23, 25, 29, 35, 44, 46, 53, 56). Human herpesviruses are divided into three classes, namely, the alpha-, beta-, and gammaherpesviruses, based on homologies found in their genomic organization as well as in protein sequences and function (45). Crystal structures have been determined for the processivity factor UL42 from the alphaherpesvirus herpes simplex virus type 1 (HSV-1) and for UL44 from the betaherpesvirus human cytomegalovirus (HCMV) (2, 3, 58). Despite having little if any sequence homology with processivity factors outside of their herpesvirus subfamily, these structures all share the “processivity fold” originally seen in the structure of the bacterial beta clamp (26). Interestingly, some of these processivity factors have a different quaternary structure. PCNA forms a head-to-tail trimeric ring (18, 27), HSV-1 UL42 is a monomer (10, 14, 16, 46, 58) equivalent to one-third of the PCNA complex, and HCMV UL44 is a head-to-head dimer in the form of a C-shaped clamp (2, 3, 9).
Both HSV-1 UL42 and HCMV UL44 have a basic face that has been shown to be important for interacting with DNA (25, 35). In the case of dimeric HCMV UL44, the basic surface of each monomer faces inward, toward the center of the C clamp, and includes a basic loop, called the “gap loop,” that is thought to wrap around DNA (24). Recently the crystal structure of the bacterial beta clamp was determined in complex with DNA (15). In that structure, DNA was found to be located in the central pore of the clamp. Amino acid residues that interacted with DNA were in positions structurally homologous to those found on the positively charged faces of UL42 and UL44.
UL42 and UL44 each also has a surface, facing away from the DNA binding face, that is important for interacting with the catalytic subunit of the viral DNA polymerase. Indeed, both of these proteins have been crystallized in complex with C-terminal peptides from their respective catalytic subunits, HSV-1 UL30 and HCMV UL54 (2, 58). Together with biochemical and mutational analyses, these crystal structures indicated that, although the details of the interaction are different, the catalytic subunit of the polymerase binds to a region including and in close proximity to a long loop that connects the N- and C-terminal subdomains, called the interdomain connector loop (32-34). The corresponding region of PCNA is also important for polymerase attachment and mediates the interactions of PCNA with many other cellular proteins (40). Both UL54 and UL30 were shown to attach to their respective subunits, UL44 and UL42, by way of their extreme C termini. The C-terminal residues responsible for this interaction correspond to amino acids that are not detectably conserved, either by sequence or by structure, among herpesvirus catalytic subunits. The HSV-1 UL30-UL42 interaction involves a groove to one side of the UL42 connector loop, with hydrophilic interactions being critical (58). The HCMV UL54-UL44 interaction involves a crevice near the UL44 connector loop, and hydrophobic interactions are crucial (2, 32, 33). Moreover, the HCMV UL44 crevice is on the opposite side of the connector loop with respect to the HSV-1 UL42 groove.
Kaposi's sarcoma-associated herpesvirus (KSHV), a gammaherpesvirus, encodes a viral DNA polymerase catalytic subunit, Pol-8, and an accessory subunit, PF-8 (4, 7, 8, 29, 48, 57). PF-8 can bind to Pol-8 directly and specifically (8, 29) and is required for long-chain DNA synthesis in vitro (29). Similarly to UL44, PF-8 forms dimers in solution and when bound to DNA (9). Although it is likely that UL44 and PF-8 are the processivity factors for HCMV and KSHV, respectively, rigorous experiments demonstrating this have not been performed. However, for the sake of brevity and clarity, we will refer to these proteins as processivity factors.
Here we present the crystal structure of PF-8 and show that PF-8 forms a head-to-head homodimer akin to UL44 but lacking the long gap loops which are thought to wrap around DNA. This suggests that PF-8 binds DNA differently than does UL44 or UL42. Because Pol-8 appears to lack a long, flexible C-terminal tail with a length comparable to those of other herpesvirus Pols, we expect the mode of binding of the catalytic subunit to be different as well. Based on structural data, information from homologs, and initial biochemical results, we were able to identify possible sites for interactions with DNA and Pol-8 and to propose a model for the simultaneous interaction of all three components of the complex. Further, the availability of crystal structures for all three herpesvirus classes provides new insights into comparative structure, function, and evolution.
The construct MBP-PF-8ΔC305-396 consists of the sequence for PF-8ΔC305-396 in a pMal vector containing a PreScission protease cleavage site between the maltose binding protein (MBP) tag and the protein (58). Starting with that construct, PF-8ΔC305-396(L60M, L167M, L258M), containing three Leu-to-Met mutations, was generated using the QuikChange mutagenesis protocol (Stratagene). The presence of these specific mutations was confirmed by sequencing at the Dana Farber Cancer Institute, Molecular Biology Core Facility. The primers used to make these constructs are listed in Fig. S1 at https://coen.med.harvard.edu.
Escherichia coli BL21(DE3)pLysS (Stratagene) containing the MBP-PF-8ΔC305-396 plasmid was grown in LB containing 0.1 μg/ml ampicillin. After reaching an optical density at 600 nm of between 0.6 and 0.8, the cultures were cooled without shaking at 16°C for 1 h. After cooling, 0.3 mM isopropyl-beta-d-thiogalactopyranoside (RPI) was added and the bacteria were shaken at 16°C for 24 h. The bacteria were then pelleted and stored at −80°C until use. Later, the pellets were resuspended in lysis buffer containing 50 mM Tris (pH 7.5), 500 mM NaCl, 20% glycerol, 5 mM EDTA, 2 mM dithiothreitol (DTT), 0.5% Triton X-100, and Roche Complete protease inhibitors and lysed by sonication. The bacterial lysates were then centrifuged at 15,000 rpm (Beckman JA-20 rotor) for 1 h, and the supernatant was applied to an amylose column (NEB) that was equilibrated with buffer A (50 mM Tris [pH 7.5], 500 mM NaCl, 20% glycerol, 5 mM EDTA, 2 mM DTT). After the protein was loaded, the column was washed with 20 column volumes of buffer A, and finally MBP-PF-8ΔC305-396 was eluted from the column with buffer A plus 10 mM maltose. The MBP tag was cleaved from PF-8ΔC305-396 by adding PreScission protease (GE Healthcare) at a final concentration of approximately 1:100 and incubating overnight at 4°C. The next day, the protein was diluted twofold with buffer A without NaCl. The protein was then loaded onto a 5-ml HiTrap heparin HP column (GE Healthcare) that had been equilibrated with buffer B (50 mM Tris [pH 7.5], 250 mM NaCl, 10% glycerol, 1 mM EDTA, 2 mM DTT). The column was then washed with buffer B, and PF-8ΔC305-396 was eluted with a linear gradient from buffer B to buffer C (50 mM Tris [pH 7.5], 1 M NaCl, 10% glycerol, 1 mM EDTA, 2 mM DTT).
Fractions containing a majority of PF-8ΔC305-396 were concentrated using an Amicon Ultra-15 centrifugal filter unit with a 10-kDa molecular mass cutoff (Millipore) and loaded onto a Superdex 200 gel filtration column (GE Healthcare) that was run with buffer D (50 mM Tris [pH 7.5], 500 mM NaCl, 10% glycerol, 1 mM EDTA, 2 mM DTT). Fractions from this column were collected and diluted to 150 mM NaCl with buffer D without NaCl. The protein was then loaded onto a HiTrap SP Sepharose column (GE Healthcare) that was washed with buffer E (50 mM Tris [pH 7.5], 150 mM NaCl, 10% glycerol, 1 mM EDTA, 2 mM DTT) and eluted with a linear gradient from buffer E to buffer C. The peak fractions were combined and dialyzed overnight into storage buffer containing 20 mM Tris (pH 7.5), 500 mM NaCl, 20% glycerol, 0.1 mM EDTA, and 2 mM DTT. The protein was then concentrated using a 10,000-molecular-weight-cutoff concentrator (Sartorius) to approximately 10 mg/ml according to the Bradford assay (Bio-Rad).
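The salt adjustments in this purification (the twofold dilution with NaCl-free buffer A before the heparin column, and the dilution of gel filtration fractions from 500 mM to 150 mM NaCl before the SP Sepharose column) follow the usual C1·V1 = C2·V2 relationship. The following is a minimal sketch of that arithmetic only; the function name and the 10-ml sample volume are illustrative assumptions and do not come from the protocol.

```python
def diluent_volume(sample_volume_ml, start_mm, target_mm):
    """Volume (ml) of salt-free buffer needed to bring a sample from
    start_mm to target_mm NaCl, from C1*V1 = C2*(V1 + V_diluent)."""
    if not 0 < target_mm <= start_mm:
        raise ValueError("target must be positive and no greater than start")
    return sample_volume_ml * (start_mm / target_mm - 1.0)


# Hypothetical 10-ml sample; the volume is an assumption, not from the text.
# Gel filtration fractions in buffer D (500 mM NaCl) brought to 150 mM NaCl:
to_150 = diluent_volume(10.0, 500.0, 150.0)
# Twofold dilution of the amylose eluate (500 mM -> 250 mM NaCl):
to_250 = diluent_volume(10.0, 500.0, 250.0)
print(round(to_150, 2), round(to_250, 2))
```

Under these assumptions, each volume of pooled fractions needs roughly 2.3 volumes of NaCl-free buffer D to reach 150 mM NaCl, whereas the twofold dilution needs one volume.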
Selenomethionine (Se-Met)-containing PF-8ΔC305-396(L60M, L167M, L258M) mutant protein was expressed by modification of a previously published protocol (12). Briefly, an overnight starter culture was used to inoculate M9 minimal medium that was supplemented with sterilely filtered 2 mM MgSO4, 0.1 mM CaCl2, 0.4% glucose, and 0.00005% thiamine. Once this culture reached an optical density at 600 nm of between 0.6 and 0.8, 100 mg of Thr, Lys, and Phe; 50 mg of Leu, Ile, and Val; and 60 mg of l-Se-Met were added to each liter. The cultures were allowed to grow for an additional 15 min at 37°C and then cooled and induced as described above for the native protein. The protein purification protocol was the same as that described for the native protein except that the concentration of DTT in each buffer was increased to 5 mM.
Concentrated protein was filtered through a 0.22-μm spin filter (Corning) to remove any debris. The protein was then combined with the precipitating solution 1:1 and crystallized by hanging-drop vapor diffusion at 22°C. Possible crystallization conditions were identified using crystallization screens from Hampton Research. Crystals for both the Se-Met and native proteins were obtained under conditions where either polyethylene glycol 3350 or (NH4)2SO4 was the precipitant. The Se-Met crystal that diffracted X rays to the highest resolution was grown in 18% polyethylene glycol 3350, 100 mM Tris HCl (pH 8.2), 0.2 M lithium chloride, and 20 mM DTT. The best crystal of native protein was grown in 1.75 M (NH4)2SO4, 100 mM Tris (pH 7.4), and 20 mM DTT. The crystals were cryoprotected by slow addition of the well solution plus additional precipitant and 20% glycerol for the Se-Met crystals or 20% ethylene glycol for the native crystals.
Se-Met and native data sets were collected at the Advanced Photon Source at the Argonne National Laboratory (Argonne, IL) on the ID24-E beamline. Images of the diffraction patterns were processed and merged using HKL2000 (43). The PHENIX software package was used to locate four heavy-atom sites corresponding to the three leucine-to-methionine mutations as well as M157 (1). Model building was initially done using Coot (11), refinement with REFMAC5 (41), and molecular replacement with Phaser (49). Figures were generated using the PyMOL Molecular Graphics System (http://www.pymol.org/).
Structural comparisons were done using DaliLite (20). For structure-based sequence analysis, the structures of UL44 (1T6L), UL42 (1DML-A), and PCNA (1AXC) were first aligned using Dali (19), and then the sequences for related proteins were aligned using ALIGN (21, 22). The helices and sheets were labeled according to the convention used for PCNA by Krishna et al. (27). For comparisons involving the N- and C-terminal domains of each protein, each protein was first divided into two pieces by removing the sequence corresponding to the connector loop, and then the halves were compared using DaliLite (20). UL42 (1DML) and PCNA (1AXC) were crystallized in complex with peptides, and these were removed for both the analysis and the diagrams.
A 28-mer peptide and a 15-mer peptide (QCLFQNNTSATVAMLYNFLDIPVTFPTP and MLYNFLDIPVTFPTP) corresponding to the C-terminal residues 985 to 1012 and 998 to 1012, respectively, of KSHV Pol-8 and a scrambled version of the 28-mer peptide (QPTCLPFNQTFPVNITLSFATYNVLMAD) were obtained from Invitrogen. The C termini of the 28-mers were synthesized as amides instead of carboxylic acids due to difficulties in synthesis. In vitro-translated KSHV Pol-8 and PF-8 were synthesized from pTM1-Pol-8 and pM1-PF-8 as described previously (29) using the TNT T7 Quick coupled transcription/translation reticulocyte system (Promega). The rapid-plate DNA synthesis assay was carried out using the in vitro-translated protein according to a previously described method (30) with modifications. Briefly, a 20-mer oligonucleotide primer (5′-GCCAATGAATGACCGCTGAC-3′) and a 5′-end-biotinylated 100-mer oligonucleotide template (5′-biotin-AGCACTATTGACATTACAGAGTCGCCTTGGCTCTCTGGCTGTTCGTTGCGGGCTCCGCGTGCGTTGGCTTCGGTCGTCCCGTCAGCGGTCATTCATTGGC-3′), in a 1.2:1 ratio, were heated to 90°C for 5 min and gradually cooled to room temperature. The annealed primer-template was diluted to 5 pmol/μl with phosphate-buffered saline and incubated overnight at 4°C on 96-well streptavidin-coated plates (Roche Applied Sciences). DNA synthesis was carried out after washing the plate once with phosphate-buffered saline. The 50-μl reaction mixture containing 1 μl of in vitro-synthesized Pol-8 and PF-8, 100 mM (NH4)2SO4, 20 mM Tris-HCl (pH 7.5), 3 mM MgCl2, 0.1 mM EDTA, 0.5 mM DTT, 2% glycerol, 40 μg/ml bovine serum albumin, 5 μM deoxynucleoside triphosphates, and 1 μM digoxigenin-11 (DIG)-dUTP (Roche Applied Science) was incubated in the absence or presence of each peptide for 1 h at 37°C. DNA synthesis was determined by the incorporation of DIG-dUTP using a DIG detection enzyme-linked immunosorbent assay kit (Roche Applied Science). 
From this kit, the anti-DIG-peroxidase was incubated on the plate for 1 h at 37°C, and its substrate, 2,2′-azino-bis(3-ethylbenzthiazoline)-sulfonate, was added after washing. The absorbance measurement was taken at 405 nm on a microplate reader (Tecan Genios Pro, Grodig, Austria). The inhibitory threshold was set at 50%. The experiment was performed in triplicate.
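As an illustration of how a 50% inhibitory threshold might be applied to triplicate absorbance readings, the sketch below computes percent inhibition relative to a no-peptide control. The function names, the optional background subtraction, and the choice to count values at or above the threshold as inhibitory are assumptions made for this example; the text does not specify how the readings were normalized.

```python
from statistics import mean


def percent_inhibition(a_peptide, a_no_peptide, a_background=0.0):
    """Percent inhibition of DNA synthesis from replicate A405 readings.

    a_peptide and a_no_peptide are sequences of replicate absorbances;
    a_background is an optional blank value (assumed, not from the text).
    """
    signal = mean(a_peptide) - a_background
    control = mean(a_no_peptide) - a_background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (1.0 - signal / control)


def is_inhibitory(a_peptide, a_no_peptide, a_background=0.0, threshold=50.0):
    """True if inhibition meets the threshold (50% in the assay above)."""
    return percent_inhibition(a_peptide, a_no_peptide, a_background) >= threshold


# Placeholder readings for illustration only; these are not measured data.
with_peptide = [0.5, 0.5, 0.5]
without_peptide = [1.0, 1.0, 1.0]
print(percent_inhibition(with_peptide, without_peptide))
print(is_inhibitory(with_peptide, without_peptide))
```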
Using the Swiss Protein Data Bank (PDB) Viewer (17), the crystal structure of HSV-1 UL30 (2GV9) was divided into two pieces (namely, the thumb domain to the C terminus, amino acids 957 to the end, and the remainder of the protein) and then docked as a rigid body onto the structure of the primer-template DNA from the RB69 crystal structure (PDB entry 1IG9). The pieces were placed in such a way that amino acid residues in contact with DNA would superimpose with structurally homologous residues from RB69. The Pol-8 sequence was aligned with that of UL30 using ClustalW (52), and the docked UL30 model was converted to a homology model of Pol-8, using the Swiss PDB Viewer. Except for a slight rotation of the thumb domain at a point at the connection between amino acids 822 and 823 in the thumb of Pol-8, the main chain of Pol-8 was treated essentially as a rigid body. Side chain rotamers were chosen manually at the intermolecular interfaces to avoid unfavorable steric clashes and to produce favorable contacts with the DNA. The DNA model at the distal end of the duplex was extended as B form.
The crystal structure of the PF-8 dimer was added, as a rigid body, to the Pol-8/DNA model in such a way that the DNA passed through the central cavity of the C clamp without steric clashes. At the same time, we sought to bring two exposed hydrophobic patches in contact with one another, one located on the C-terminal helix of the Pol-8 thumb and the other consisting of residues that are part of and adjacent to the connector loop of PF-8. Basic residues in the central cavity of PF-8 were allowed to rotate and form hydrogen bonds with the phosphate backbone of the DNA, if possible. For the purpose of counting hydrogen bonds in the model, we used a restrictive definition, where the distance between H donor and H acceptor was less than 3.3 Å and where the H donor/H bond/H acceptor angle was greater than or equal to 120 degrees. Also, slight adjustments to side chains on the Pol-8/PF-8 interface were made. Finally, a simple energy minimization was calculated using REFMAC5 (41), taking the Fourier transform of the atomic model and its phases, as calculated in an arbitrary large unit cell, as weak reference standards. That step ensured that small energy problems could be resolved without permitting the model, and especially its main chain, to change in significant ways.
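The restrictive hydrogen bond definition used for counting can be expressed as a simple geometric test. The sketch below is one reading of that criterion: it takes the distance as donor-to-acceptor and the angle as the donor-hydrogen-acceptor angle measured at the hydrogen. The function names and the example coordinates are hypothetical, and the code is not the procedure used to build the model.

```python
import math


def angle_deg(a, b, c):
    """Angle (degrees) at point b formed by points a-b-c."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(y * y for y in v2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def is_hydrogen_bond(donor, hydrogen, acceptor, max_dist=3.3, min_angle=120.0):
    """Donor-acceptor distance < 3.3 A and donor-H-acceptor angle >= 120 deg."""
    close_enough = math.dist(donor, acceptor) < max_dist
    linear_enough = angle_deg(donor, hydrogen, acceptor) >= min_angle
    return close_enough and linear_enough


# Hypothetical coordinates in angstroms, for illustration only.
print(is_hydrogen_bond((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.9, 0.0, 0.0)))
print(is_hydrogen_bond((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.5, 0.0, 0.0)))
```

The first call describes a linear arrangement within the distance cutoff and passes; the second fails only because the donor and acceptor are 3.5 Å apart.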
Based on the crystal structures of UL42 (58) and UL44 (3) and on secondary structure predictions using PSIPRED (38), we predicted that PF-8 has an unstructured C terminus (amino acids 300 to 396). Additionally, deletion experiments indicated that amino acids 302 to 396 from the C terminus of PF-8 are not required for PF-8 to bind DNA, to bind to the KSHV DNA polymerase Pol-8, or to perform processive DNA synthesis in vitro (8, 9). Therefore, a truncated form of PF-8 lacking residues 305 to 396 was expressed in order to facilitate structural studies.
Initial attempts to solve the structure of PF-8ΔC305-396 from native protein crystals by molecular replacement, using the structure of either UL44 (1T6L) (3), UL44 in complex with the C terminus of UL54 (1YYP) (2), or UL42 in complex with the C terminus of UL30 (1DML) (58) as a phasing model, either intact or divided into subdomains, all proved to be unsuccessful. Methods to obtain experimental phases were then pursued, first by soaking crystals in solutions with a variety of heavy atoms and later using a Se-Met protein. The sequence of PF-8ΔC305-396 includes only two methionines, which would normally be insufficient to obtain experimental phases for a protein of its size. Therefore, three leucines (L60, L167, and L258) that were predicted by LOOPP (39, 51, 54) to be in well-structured and different regions of the protein were mutated to methionines. Since these are conservative mutations, they were predicted not to disrupt the structure of the protein. A single-wavelength anomalous dispersion data set of a crystal of PF-8ΔC305-396(L60M, L167M, L258M) containing Se-Met and a data set for the native protein were collected. The crystal structure of native PF-8ΔC305-396 was then solved by molecular replacement using the experimentally determined Se-Met structure as a search model. Figure 1 shows a portion of the experimental map from the Se-Met protein containing one of the four Se-Met sites used for structure determination.
The crystallographic data collection and refinement statistics are shown in Table 1. The structure of PF-8ΔC305-396 was determined at a resolution of 2.8 Å with an Rcryst and Rfree of 0.267 and 0.297, respectively, for the native protein. PF-8ΔC305-396 crystallized with one molecule per asymmetric unit in the space group P6122 with unit cell dimensions for the native structure of a = b = 57.83 Å and c = 386.10 Å. The ordered portion of the structure includes amino acid residues 4 to 112 and 123 to 300. Both the N and C termini, as well as a loop formed by amino acids 113 to 122, are disordered in the native structure.
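To give a sense of scale for the reported cell, the sketch below computes the volume of a hexagonal unit cell from a and c and divides it by the 12 general positions of space group P6122 to estimate the volume available to the single molecule in each asymmetric unit. The cell edges are taken as angstroms, the constant and function names are chosen for this example, and the calculation is an illustration rather than part of the reported analysis.

```python
import math


def hexagonal_cell_volume(a, c):
    """Volume of a hexagonal unit cell: V = a^2 * c * sin(120 degrees)."""
    return a * a * c * math.sin(math.radians(120.0))


A_EDGE = 57.83   # a = b, from the text (taken as angstroms)
C_EDGE = 386.10  # c, from the text (taken as angstroms)
N_SYMOPS_P6122 = 12  # general positions in P6(1)22 (point group 622)

cell_volume = hexagonal_cell_volume(A_EDGE, C_EDGE)
asu_volume = cell_volume / N_SYMOPS_P6122

print(f"unit cell volume: {cell_volume:.0f} cubic angstroms")
print(f"volume per asymmetric unit: {asu_volume:.0f} cubic angstroms")
```

Dividing the asymmetric unit volume by the molecular mass of the crystallized construct would give a Matthews coefficient; the mass is not stated in this section, so that step is left to the reader.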
PF-8ΔC305-396 (Fig. 2) is composed of two domains, an N-terminal domain and a C-terminal domain, which are connected by an interdomain connector loop that includes residues 146 to 163. Beneath the connector loop, the main nine-stranded antiparallel beta sheet extends from one end of the monomer to the other and forms the backbone of the molecule. Five of these strands are contributed by the C-terminal subdomain, and four belong to the N-terminal subdomain. The N-terminal and C-terminal domains are each capped by an additional, smaller, antiparallel beta sheet, which is five stranded in the N-terminal subdomain and four stranded in the C-terminal subdomain. On the back face of the molecule, the side opposite the connector loop, there are two alpha helices per subdomain, four in total, that are oriented antiparallel to one another.
The two subdomains of the PF-8 monomer are topologically and structurally similar to each other, with a root mean square deviation (RMSD) between halves of 3.5 Å (20, 28). The greatest similarity is seen when the four beta strands from the central backbone of the N-terminal domain, βF1, βB1, βC1, and βD1, are aligned with the four beta strands from the small beta sheet at the end of the C-terminal domain, βF2, βB2, βC2, and βD2 (Fig. 3A).
Previous work has demonstrated that both full-length PF-8 and truncated PF-8ΔC305-396 form dimers when free in solution and in complex with DNA (9). Indeed, PF-8 crystallized such that two molecules related by a crystallographic twofold axis formed a head-to-head dimer, similar to HCMV UL44 and unlike monomeric HSV UL42 and the head-to-tail trimeric PCNA (Fig. 2C and 4B). Similar to UL44, PF-8 dimerizes by forming an extended antiparallel beta sheet, due to hydrogen bonding between the βI1 strands from two twofold-symmetry-related molecules and stabilized by additional interactions involving amino acids in the loop that connects βF1 and αB1 (Fig. 4C).
Previous studies showed that deletions in the N terminus or C terminus (PF-8Δ1-21 or PF-8Δ277-396) of PF-8 resulted in its inability to dimerize (8, 9). Since these deletions would disrupt the N-terminal sheet βA1 and helix αA1 and the C-terminal sheet βI2 (Fig. 2), they most likely disrupted the overall fold of the protein.
The structure of the PF-8 monomer is similar to the structures of monomers of the processivity factor UL42 and the presumed processivity factor UL44 from alpha- and betaherpesviruses and also is similar to that of the eukaryotic processivity factor PCNA (Fig. 4A). Structural alignments indicate that the monomeric structure of PF-8 is most similar to that of UL44, with an RMSD of 2.9 Å (PDB entry 1T6L), followed by UL42 (1DML, 3.5 Å) and then eukaryotic PCNA (1AXC, 4.2 Å). The PF-8 structure can also be readily superimposed onto the C-terminal two-thirds of the structure of the bacterial beta clamp (2POL) even though each monomer of the beta clamp is approximately 1.5 times larger than PF-8, includes three subdomains, and is able to form a closed ring through dimerization. PF-8 fits to the C-terminal portion of the beta clamp structure with an RMSD of 3.4 Å and a sequence identity of 9% within the aligned portion.
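The RMSD values quoted for these comparisons summarize how far apart structurally equivalent atoms lie after two structures have been superimposed. The sketch below shows only that final metric for coordinate lists that are already paired and superimposed; it does not reproduce the alignment or superposition performed by DaliLite, and the function name and toy coordinates are assumptions for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of
    (x, y, z) coordinates that are already paired and superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Toy coordinates in angstroms (not taken from any PDB entry): every
# equivalent atom pair is displaced by exactly 3 along z.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
print(rmsd(set_a, set_b))
```

Because the value depends on which residues are treated as equivalent, RMSDs from different alignment programs are not always directly comparable.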
Although the various herpesvirus processivity factors share a similar fold, their sequences are quite diverse and it was not obvious how to align their sequences correctly without structural information. With structural information now available for the known and presumed processivity factors of the alpha-, beta-, and gammaherpesviruses, their structures have now been aligned with one another and with PCNA, and it is straightforward to extend the alignment to other members within each class whose structures are currently unknown (Fig. 5). After structural alignment, the observed sequence identities for PF-8 with HCMV UL44, HSV UL42, and eukaryotic PCNA were 14%, 11%, and 11%, respectively (20, 28), percentages that would typically be considered insignificant (47). A comparison of the various proteins shows that a single leucine, at position 168 in PF-8, is conserved among all of the herpesvirus proteins (but not PCNA) and that another leucine, L269 in PF-8, is conserved in PCNA and all of the herpesvirus proteins except for that of varicella-zoster virus. These leucines are within 5 Å of each other, are buried in the C-terminal half of the protein, and do not have any known function other than serving as structural elements.
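Sequence identity after structural alignment is simply the fraction of aligned positions at which the two proteins carry the same residue. The following sketch computes that percentage from two gapped, equal-length alignment strings. Counting only columns where neither sequence has a gap is an assumption about the convention, and the toy sequences are invented for the example and are not fragments of PF-8, UL44, UL42, or PCNA.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where both sequences have a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Invented toy alignment: four gap-free columns, three of them identical.
print(percent_identity("AC-DEF", "ACGD-L"))
```

With identities in the 9% to 14% range, as reported here, sequence alone would give little basis for an alignment, which is why the structural superposition is used to define the equivalent positions first.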
The structure-based sequence alignment of the processivity factors shows important similarities and differences in the lengths of various loops and in the types of amino acid residues that make them up. For example, one loop that is not present in the structure of PCNA and was previously implicated as an important factor in the quaternary structure of the herpesvirus processivity factors is the loop between βF1 and αB1 (3, 58). It was previously shown that UL44 contains hydrophobic residues in this loop and in beta strand βI1 that interact to stabilize the formation of UL44 dimers. When these hydrophobic residues, specifically L86, L87, and F121, are replaced with alanine, dimerization is disrupted (3). In contrast, although UL42 has two hydrophobic residues within the βI1 sheet (L157 and M158), there are a number of charged residues on both ends of the sheet, and there are additional amino acids (Q104, K105, and R106) within the loop between βF1 and αB1 that would potentially repel each other and prevent dimerization. This is consistent with HSV UL42 being a monomer in solution and in crystals (10, 14, 16, 46, 58). PF-8 is more similar to HCMV UL44, as this loop consists of only uncharged or hydrophobic residues that are capable of stacking against one another or forming hydrogen bonds. Both this loop and beta strand βI1 show sequence conservation within each herpesvirus class (Fig. 5). Based on our analysis, one would predict that all alphaherpesvirus processivity subunits are monomers and that all betaherpesvirus and gammaherpesvirus subunits are dimers.
Interestingly, BMRF1, the presumed processivity factor from the gammaherpesvirus Epstein-Barr virus, was reported to form rings when visualized by electron microscopy (36). A crystal structure of this protein was recently deposited in the PDB (2Z0L), although we are unaware of any corresponding published paper to date. In the crystal structure, BMRF1 forms head-to-head dimers similar to those of PF-8. However, there is an additional contact, causing a tail-to-tail dimerization, which creates a second extended beta sheet at the opposite end of the monomer and thus forms a closed, tetrameric ring. This additional contact is reminiscent of the tail-to-tail extended beta sheet that was seen in the crystal structure of UL44 in complex with the C-terminal peptide of UL54 (2). In BMRF1, the additional beta interaction is stabilized by disulfide bond formation across a crystallographic twofold axis. In PF-8, the corresponding region includes a number of hydrophobic and aromatic residues, which might similarly support transient intermolecular association and which would otherwise be exposed to solvent. To our knowledge there have been no experiments to show whether BMRF1 does indeed oligomerize under physiological reducing conditions, and it was previously reported that BMRF1 stimulated the catalytic subunit at a 2:1 ratio (55), suggesting dimerization. For PF-8, only dimers are seen in solution (9) when the ionic strength is sufficient to keep the protein from aggregating (data not shown). Thus, it is unclear whether these higher-order multimeric structures are biologically relevant or are artifacts. Experiments have shown that both UL44 and PF-8 bind to DNA as dimers (3, 9, 35). However, most of these experiments were performed in the presence of excess DNA and do not rule out the possibility that a weakly associated tetramer could form on the surface of DNA, promoted by cooperative binding.
Another loop that shows significant variation among the herpesvirus processivity factors and PCNA is the 13-residue “D2E2” loop of PCNA. In PF-8, the 18 residues between amino acids 218 and 235 form the corresponding loop. Based on sequence comparisons between herpesvirus classes and on the structures of UL42, UL44, and PF-8 (Fig. 4), it appears that the alphaherpesvirus processivity factors generally have the longest D2E2 loops, consisting of approximately 30 amino acids, while the betaherpesvirus processivity factors are predicted to be the shortest, with a loop of approximately 5 residues. It has been suggested that this loop, which is an antigenic site for PCNA, might be involved in forming back-to-back dimers of PCNA trimers (22, 42). It is also possible that this loop is important for forming specific contacts with other proteins and that the variation in length may be related to the specificity of this interaction.
Another loop that shows variation among the three classes of herpesviruses (alpha-, beta-, and gammaherpesviruses) and conservation within these classes is the “gap” loop. This will be discussed in more detail below.
When the two halves of the UL42, UL44, PF-8, and PCNA monomers are compared (20) based on how closely the main chain traces can be superimposed, it appears that PCNA is much more symmetric. Its subdomains are more similar to one another than are those of the herpesvirus processivity factors. Of the herpesvirus proteins, UL44 is most symmetric, followed by PF-8 and then UL42 (Fig. 3). In fact, the C-terminal subdomain of UL42, UL44, or PF-8 is structurally more similar to either subdomain of PCNA than it is to its own N-terminal subdomain (Table 2). Additionally, when the structures of the herpesvirus processivity factors are compared to one another, their C-terminal subdomains are the most structurally conserved. Although the N-terminal subdomains of these proteins resemble one another more closely than do any of the C-terminal subdomains, overall the N-terminal domains show less conservation than the C-terminal domain, possibly because this terminus is involved in forming a head-to-head dimer in the case of PF-8 and UL44.
Perhaps the trimeric, head-to-tail, circular structure of PCNA limited the ability of the two domains to diverge structurally during evolution. In contrast, the herpesvirus proteins may exhibit greater variability between subdomains due to the geometry of interaction being less constrained when only one end, or neither end, of the protein is involved in multimer formation.
PF-8 is similar in structure to UL44 and UL42 in that it has a back face rich in basic residues (Fig. 6). The corresponding positively charged surfaces of UL42 and UL44 are important for binding to DNA (24, 25, 35), and it is reasonable to assume this is true for PF-8 as well, although to our knowledge, no studies have yet been reported on this issue. This basic face of each PF-8 monomer includes five arginines, nine lysines, and one histidine. This side of the PF-8 monomer includes approximately the same number of lysines and arginines as does a monomer of UL44, but with a different spatial distribution. One reason for this difference is that UL44 has five basic residues concentrated in a long, 17-residue, so-called “gap” loop between αA2 and βB2. In contrast, PF-8 has a much shorter, six-residue loop with only two positively charged residues (Fig. 5). Correspondingly, the remaining positively charged residues are distributed more uniformly along the surface of the central cavity of PF-8. Notably, the gap loop of UL44 is disordered in its crystal structures (2, 3), and mutating any of its five basic residues to alanine causes a measurable reduction in the ability to bind DNA (24). Somewhat surprisingly, those individual basic residues in the central cavity of UL44 that were tested appeared to be relatively less important, in that reduced DNA binding occurred only when several of them were mutated at once (24).
The structure of UL44 in complex with duplex DNA was recently modeled computationally (Fig. 7A) (24). In that model, which is consistent with mutational and cross-linking analyses (24), UL44 binds to DNA as a dimer, with its gap loops wrapped around DNA in a number of different arrangements having roughly equal energies. Frequently, the two loops of the dimer were seen to interact on the opposite side of the DNA, encircling it completely in the model. When we superimpose PF-8 onto the UL44/DNA model (Fig. 7B), it is clear that the loops in PF-8 that correspond to the gap loops of UL44 are not long enough to wrap around DNA in a similar way. Interestingly, a long gap loop is predicted to be a feature associated only with the betaherpesviruses, including HCMV, human herpesvirus 6 (HHV-6), and HHV-7 (Fig. 5), as it is not found in the structure of PCNA or UL42 or, based on our alignments (Fig. 5), in other gammaherpesvirus processivity factors. Given the importance of the gap loops to DNA binding in UL44, the mechanism by which PF-8 is tethered to DNA might be different from that of UL44, even though both processivity factors form stable head-to-head C-clamp dimers.
The only previously reported data on the interaction between PF-8 and Pol-8 showed that when the N-terminal residues (PF-8Δ1-21 and PF-8Δ1-27) or C-terminal residues (PF-8Δ277-396 and PF-8Δ279-396) of PF-8 are truncated, it is unable to bind to Pol-8 (8, 9). Based on the structure, these regions are integral to the fold of PF-8, so their removal would be predicted to affect the overall fold of the protein (Fig. 2).
By analogy with the binding of UL42 and UL44 to their respective catalytic subunits (2, 32, 33, 58), we would expect the binding between Pol-8 and PF-8 to be mediated by the extreme C terminus of Pol-8 and to involve residues in and around the connector loop of PF-8. In PF-8, the connector loop is longer than the corresponding segments of UL42, UL44, or PCNA (Fig. 5). When the atomic model of PF-8 is considered in isolation, a patch on the PF-8 surface, including a portion of the connector loop as well as a nearby portion of the beta sheet (Fig. 8), appears to have a number of solvent-exposed aromatic and hydrophobic side chains. The presence of such exposed hydrophobic side chains would be energetically unfavorable and may explain, at least in part, why PF-8 is difficult to keep in solution at physiological ionic strength (data not shown). In fact, in the PF-8 crystal structure these hydrophobic residues are not exposed to solvent but instead participate in forming crystal-packing contacts with other “exposed” hydrophobic residues that form part of the base of the C-terminal subdomain of a neighboring molecule of PF-8.
To begin characterizing the interaction between PF-8 and Pol-8, DNA synthesis by Pol-8 and PF-8 was monitored in the presence of peptides corresponding to the C-terminal 15 or 28 residues of Pol-8 or a scrambled version of the 28-mer (Fig. 9). The peptide corresponding to the 28 C-terminal residues of Pol-8 inhibited DNA synthesis with a 50% inhibitory concentration of ~50 μM. The 50% inhibitory concentration of the scrambled peptide or the peptide corresponding to the last 15 C-terminal residues was greater than 250 μM. These results are consistent with the idea that at least some of the residues at the C terminus of Pol-8 are involved in binding to PF-8.
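To put the two inhibitory concentrations on a common scale, the following minimal Python sketch estimates the fraction of DNA synthesis that would remain at a given peptide concentration. It assumes a simple one-site dose-response curve with a Hill coefficient of 1; that curve shape is an illustrative assumption and not a model fitted to the data in Fig. 9, and the function name `fraction_activity` is hypothetical.

```python
def fraction_activity(conc_um: float, ic50_um: float, hill: float = 1.0) -> float:
    """Fraction of uninhibited activity remaining at a given inhibitor concentration.

    Assumes a simple dose-response curve: activity = 1 / (1 + ([I] / IC50) ** hill).
    The Hill coefficient of 1 is an assumption, not a value from the study.
    """
    if conc_um < 0 or ic50_um <= 0:
        raise ValueError("concentration must be >= 0 and IC50 must be > 0")
    return 1.0 / (1.0 + (conc_um / ic50_um) ** hill)


# IC50 values quoted in the text: ~50 uM for the 28-mer; >250 uM for the
# 15-mer and the scrambled peptide (250 uM is used here as a lower bound).
ic50_28mer = 50.0
ic50_lower_bound_controls = 250.0

for conc in (10.0, 50.0, 250.0):
    a28 = fraction_activity(conc, ic50_28mer)
    a_ctl = fraction_activity(conc, ic50_lower_bound_controls)
    print(f"{conc:6.1f} uM: 28-mer {a28:.2f}, controls >= {a_ctl:.2f}")
```

Under these assumptions, a peptide concentration that halves synthesis with the 28-mer would leave more than about 80% of the activity with either control peptide.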
Due to the extensive sequence conservation between herpesvirus DNA polymerase catalytic subunits (>35% identity), it was relatively straightforward for us to align the sequence of Pol-8 with UL30 and to create a homology model of Pol-8 by mapping its sequence onto the UL30 crystal structure (31) (as shown for the thumb region in Fig. 10). The high degree of sequence conservation gave us confidence in using the UL30 backbone conformation as a model. However, starting with the last four residues of the thumb domain (amino acids 1194 to 1197 in UL30 and 1005 to 1008 in Pol-8) and into the C-terminal extension (amino acids 1198 to 1235 in UL30 and 1009 to 1012 in Pol-8), there is no detectable homology between Pol-8 and UL30 (Fig. 10). In the crystal structure of UL30, the final four residues of the thumb domain form a short four-residue helix. Although those residues are ordered, we have chosen not to include the corresponding residues of Pol-8 in our model of the complex, because there is no obvious sequence homology and because proline 1006 of Pol-8 might prevent helix formation. Note that there is room in the model of the complex for the C-terminal tail to exit the thumb domain, without clashing. While we would not be surprised if the missing eight residues of Pol-8 played a role in binding, we are reluctant to make a specific conformational suggestion without having an established structure to work from. Interestingly, the Epstein-Barr virus catalytic subunit completely lacks this C-terminal extension (see Fig. S2 at https://coen.med.harvard.edu). Indeed, several residues from the segment of Pol-8 that are 16 to 28 residues from the C terminus map onto the last and next-to-last alpha helices of the thumb domain. These observations led us to the hypothesis that the interaction between Pol-8 and PF-8 involves residues that we are able to map onto the crystallographically structured thumb of UL30.
Importantly, the homology model positions a number of aromatic and hydrophobic side chains on one of the solvent-exposed surfaces of the C-terminal helix of the thumb. The observation that some of these hydrophobic residues align with charged or hydrophilic residues in the UL30 homolog suggests the possibility that the residues in question participate in forming a protein-protein interface in Pol-8 but not in UL30 (Fig. 10). Indeed, the binding of UL30 to UL42 has been well studied (6, 33, 58) and has been shown to be dependent predominantly on interactions with the portion of the C terminus of UL30 that is not homologous with or present in Pol-8 (consisting of the C-terminal 36 amino acids and, most importantly, the C-terminal 18 amino acids).
Collectively, these structural observations led us to hypothesize that two groups of solvent-exposed hydrophobic residues, one including the PF-8 connector loop and the other involving the C terminus of Pol-8, might contact one another directly and shield one another from solvent, as part of the basis for processivity factor binding by polymerase catalytic subunits in the gammaherpesviruses.
As a first step toward testing this hypothesis, we wished to determine if we could model PF-8 in complex with Pol-8 and DNA, while maintaining contacts between the exposed hydrophobic residues on Pol-8 and PF-8 and while simultaneously running DNA through PF-8, in a reasonable manner, and keeping the main chains of PF-8 and Pol-8 unchanged. Using the homology model of Pol-8, the structure of PF-8, and the structure of the bacteriophage RB69 DNA polymerase in complex with DNA (PDB entry 1IG9), we were able to construct a rigid-body model of the ternary complex of PF-8, DNA, and Pol-8 (Fig. 11). Although this model does not include the final eight residues of Pol-8 (these residues did not have homologous residues in the sequence alignment with UL30 [Fig. 10]), the model does allow space for the C-terminal tail of Pol-8 to exit the complex and to interact with PF-8. In order to determine the approximate location of DNA in the model, we first superimposed the homology model of Pol-8 onto the structure of the RB69 DNA polymerase bound to the primer and template strands of DNA.
Primarily, we were interested in assembling a herpesvirus polymerase structure that was structurally complementary to the RB69 DNA-bound polymerase structure, built in such a way that the amino acid residues that were believed to contact the DNA (having either a catalytic or DNA binding role) would superimpose as closely as possible with the homologous residues from the RB69 DNA polymerase. Although we were prepared to break UL30 into several rigid subdomains, as was done by Liu et al. (31), to produce the expected interactions with DNA, we were surprised to find that it was sufficient to break UL30 into two rigid bodies: the thumb domain, and the remainder of the molecule. Relative to its orientation in the UL30 crystal structure, the thumb domain (cyan in Fig. 11) was rotated very slightly and was connected to the remainder of the molecule via a hinge point near residue 823. Finally, to complete this portion of the model, minor modifications were introduced to replace UL30 with Pol-8 and to extend the straight B-form DNA duplex further away from the active site.
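The rigid-body treatment of the thumb domain can be illustrated with a minimal Python sketch that rotates a set of points about an axis passing through a fixed hinge point, using Rodrigues' rotation formula. This is only an illustration of the geometric operation, not the procedure or software used to build the model; the coordinates, rotation axis, and 5-degree angle are hypothetical placeholders, as is the function name `rotate_about_hinge`.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def rotate_about_hinge(point: Vec, hinge: Vec, axis: Vec, angle_deg: float) -> Vec:
    """Rotate `point` by `angle_deg` about the line through `hinge` along `axis`."""
    norm = math.sqrt(sum(a * a for a in axis))
    if norm == 0.0:
        raise ValueError("axis must be non-zero")
    kx, ky, kz = (a / norm for a in axis)
    # Work in coordinates relative to the hinge so that the hinge stays fixed.
    vx, vy, vz = (p - h for p, h in zip(point, hinge))
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    dot = kx * vx + ky * vy + kz * vz
    # Cross product k x v.
    cx, cy, cz = ky * vz - kz * vy, kz * vx - kx * vz, kx * vy - ky * vx
    # Rodrigues' formula: v*cos + (k x v)*sin + k*(k.v)*(1 - cos).
    rx = vx * c + cx * s + kx * dot * (1.0 - c)
    ry = vy * c + cy * s + ky * dot * (1.0 - c)
    rz = vz * c + cz * s + kz * dot * (1.0 - c)
    return (rx + hinge[0], ry + hinge[1], rz + hinge[2])


# Hypothetical coordinates (in angstroms) standing in for thumb-domain atoms.
hinge_point = (10.0, 0.0, 0.0)
thumb_atoms = [(12.0, 0.0, 0.0), (10.0, 3.0, 1.0)]
moved = [rotate_about_hinge(p, hinge_point, (0.0, 0.0, 1.0), 5.0) for p in thumb_atoms]
```

Because the rotation is rigid, every atom keeps its distance from the hinge, which is why the main chain of the rotated domain is left unchanged by this kind of adjustment.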
In the second phase of the docking, the PF-8 C-clamp dimer was moved to a series of positions and orientations such that there were favorable contacts between the DNA and PF-8, and the position of the thumb domain was slightly adjusted to maintain both the hydrophobic contact and the hinge between the thumb and the remainder of Pol-8. These constraints were restrictive enough that nearly all of the arrangements of domains were unsuccessful. However, ultimately one of the modeled positions satisfied all of the stated requirements. The protein side chains that were potentially in contact with the DNA were then rotated manually to find low-energy rotamers, avoid steric clashes, and produce transient hydrogen bonds (mostly to demonstrate their conformational feasibility and not because we would expect them to necessarily form simultaneously while the protein is diffusing along DNA).
Our model of the ternary complex predicts the identities of a number of amino acids that may be important for the interaction between Pol-8 and PF-8 (Fig. 11). On Pol-8, these amino acids include Y1000 (from the C-terminal helix) and V864 and Y865 (from elsewhere in the thumb domain sequence). On PF-8, we find that residues from the connector loop (I150 and F153), as well as from the underlying beta sheets (V57, I276, L296, L298, and V300), are potentially involved in binding to the thumb region of Pol-8. In addition to the hydrophobic residues that we see making direct contact, there are other solvent-exposed hydrophobic residues nearby that may be involved in the interaction and that we predicted to be important prior to modeling but for which simple rigid-body modeling does not account. Notably, these include hydrophobic side chains that align with polar or charged amino acids in the homologs. On Pol-8, these include V996 and A997 from the C-terminal helix and the C-terminal FPTP sequence, which is not present in the UL30 homolog structure. On the connector loop of PF-8, another solvent-exposed hydrophobic amino acid, L151, may also be involved in binding. We infer that, in this respect, Pol-8 must be unlike its homologs, where a continuous C-terminal polypeptide extension folds to create an intermolecular contact surface. In both Pol-8 and PF-8, we predict that the two complementary hydrophobic patches are each formed by residues from two or more portions of the linear sequence (see residue lists above). If true, this may also explain why a 28-amino-acid peptide from the C terminus of Pol-8 inhibits the association between Pol-8 and PF-8 only at relatively high concentrations (50% inhibitory concentration, ~50 μM). These predictions can be tested but are outside the scope of the present report.
In addition to interactions between PF-8 and Pol-8, our model also predicts the nature of the interaction between PF-8 and DNA. In overall orientation, the DNA duplex enters the central cavity of PF-8 at an angle of ~30 degrees relative to PF-8 (Fig. 11B) and, from above, DNA traverses the cavity at an angle of 45 degrees relative to the dimeric interface and is roughly parallel to the αA2 helices of PF-8 (Fig. 11C). The angle at which DNA enters the central cavity of the PF-8 model is thus more similar to the 22-degree angle seen in the crystal structure of the beta clamp (15) than it is to the 10-degree angle predicted by computational modeling of UL44 in complex with DNA (24). If these differences are real, one possible explanation may be the absence of long gap loops in the PF-8 structure and the lack of a long flexible C-terminal tail on Pol-8.
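An entry angle of this kind can be expressed as the angle between the DNA helix axis and a reference axis of the clamp. The minimal Python sketch below computes such an angle and compares it with the two values quoted above. The choice of reference axis and the example vectors are assumptions made for illustration, since the text does not define the reference frame in detail; `axis_angle_deg` is a hypothetical helper name.

```python
import math
from typing import Sequence


def axis_angle_deg(u: Sequence[float], v: Sequence[float]) -> float:
    """Angle in degrees (0 to 90) between two undirected axes given as 3-vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        raise ValueError("axes must be non-zero")
    # abs() because an axis has no preferred direction; min() guards rounding.
    cos_angle = min(1.0, abs(dot) / (nu * nv))
    return math.degrees(math.acos(cos_angle))


# Hypothetical vectors: clamp reference axis along z, DNA axis tilted 30 degrees.
clamp_axis = (0.0, 0.0, 1.0)
dna_axis = (math.sin(math.radians(30.0)), 0.0, math.cos(math.radians(30.0)))
tilt = axis_angle_deg(dna_axis, clamp_axis)

# Compare with the angles quoted in the text for the beta clamp and the UL44 model.
reported = {"beta clamp (crystal structure)": 22.0, "UL44 (computational model)": 10.0}
closest = min(reported, key=lambda name: abs(reported[name] - tilt))
print(f"tilt = {tilt:.1f} degrees; closest reported value: {closest}")
```

With a 30-degree tilt, the difference from the beta clamp value is 8 degrees and the difference from the UL44 model value is 20 degrees, which is the sense in which the PF-8 model is closer to the beta clamp.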
The model of the complex also allows us to estimate the number of hydrogen bonds that could be formed, transiently, between basic side chains of PF-8 and phosphate groups of the DNA backbone. On one of the PF-8 monomers, there are two hydrogen bonds from R181, one from R257, and one from K186. The other monomer of PF-8 is simultaneously making two hydrogen bonds through R20, one through K186, and one through R21. Altogether, this corresponds to six amino acids being involved in hydrogen bond formation, forming eight hydrogen bonds in total. It is important to note that the model shown here is only a snapshot of the potential interactions between PF-8 and DNA. We count an additional 24 basic amino acids on the dimer that, depending on the register of the DNA, could potentially be part of the PF-8/DNA hydrogen bonding network and, at the very least, contribute to the positively charged nature of the proposed DNA binding cavity of PF-8 and thus a general electrostatic interaction between the molecules. The total number of hydrogen bonds predicted by our model for PF-8 is consistent with a biochemical study of UL44 that determined the effects of ionic strength on DNA binding. According to that experiment, each UL44 dimer forms approximately eight charge-charge interactions, formed between basic protein side chains and DNA phosphates (35). In contrast, a computational modeling study (24) on the interactions between UL44 and DNA predicted that as many as 16 hydrogen bonds per UL44 dimer could form simultaneously. Note, however, that such computational studies are designed to maximize the number of favorable interactions that they find and thus may represent upper estimates, rather than typical ones.
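The residue and hydrogen bond counts in this paragraph can be checked with a short tally. The Python sketch below uses only the numbers stated in the text; the interpretation that the "additional 24 basic amino acids" equals the basic residues on the DNA-facing surfaces of both monomers minus the six bonding residues is our reading of the arithmetic, not something the text states explicitly.

```python
# Transient hydrogen bonds to DNA phosphates in the modeled snapshot,
# as listed in the text (residue -> number of bonds), per PF-8 monomer.
h_bonds = {
    "monomer 1": {"R181": 2, "R257": 1, "K186": 1},
    "monomer 2": {"R20": 2, "K186": 1, "R21": 1},
}

total_bonds = sum(sum(bonds.values()) for bonds in h_bonds.values())
bonding_residues = sum(len(bonds) for bonds in h_bonds.values())

# Basic residues on the basic face of each monomer, from the text:
# five arginines, nine lysines, and one histidine.
basic_per_monomer = 5 + 9 + 1
basic_per_dimer = 2 * basic_per_monomer
# Assumed interpretation: the "additional" basic residues are those on the
# dimer that are not forming hydrogen bonds in this snapshot.
remaining_basic = basic_per_dimer - bonding_residues

print(total_bonds, bonding_residues, remaining_basic)  # prints: 8 6 24
```

The tally reproduces the eight hydrogen bonds from six residues, and under the stated interpretation it also accounts for the 24 additional basic residues on the dimer.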
Our findings invite speculation regarding the evolution of processivity factors. The functional and structural homologies between the herpesvirus processivity factors and the sliding clamps suggest that they share a common ancestor. We have previously speculated that HSV UL42, being structurally and mechanistically (in terms of its mode of binding to DNA) the simplest of these proteins, might be most similar to the processivity factor of the common ancestor (58). In this view, evolution would have selected for the added functions and multimerization of the more complicated sliding clamps. However, given that sliding clamps are found in bacteria, archaea, and eukaryotes (27, 37) and are thus evolutionarily ancient, it seems plausible that the ancestor of herpesviruses captured genes encoding sliding clamps from a host organism. The constraints in coding capacity of viruses may have selected for processivity factors that retained the structural fold of sliding clamps but no longer needed complicated, multisubunit clamp loading proteins. In this view, the monomeric HSV UL42 protein would resemble the captured form, which likely would originally have been less positively charged. During evolution, there would have been selection for higher-affinity binding to DNA. Subsequent selection for more clamp-like DNA binding and/or more options for protein-protein interactions might have then led to the evolution of dimeric proteins in a common ancestor of beta- and gammaherpesviruses.
After this article was accepted, a paper by Murayama et al. (K. Murayama, S. Nakayama, M. Kato-Murayama, R. Akasaka, N. Ohbayashi, Y. Kaemwari-Hayami, T. Terada, M. Shirouzu, T. Tsurumi, and S. Yokoyama, J. Biol. Chem., doi: 10.1074/jbc.M109.051581, 2 October 2009, posting date) on the crystal structure of the Epstein-Barr virus polymerase processivity factor BMRF1 was published.
We thank Piotr Sliz and Kevin Corbett for help with data collection and analysis. We also thank all members of the Hogle and Coen labs for useful discussions and advice, with special thanks to Laurie Silva, My Sam, John Genova, and Gloria Komazin-Meredith.
The crystallographic study is based upon research conducted at the Northeastern Collaborative Access Team beamlines of the Advanced Photon Source, which is supported by award RR-15301 from the National Center for Research Resources at the National Institutes of Health. This work was supported by grants AI019838 and AI026077 to D.M.C. and J.M.H. and by grants NIH RO1 DE16665 and CCSG P30 CA016520 to R.P.R.
Published ahead of print on 16 September 2009.