Structural studies have made significant contributions to our understanding of Sulfolobus spindle-shaped viruses (Fuselloviridae), an important model system for archaeal viruses. Continuing these efforts, we report the structure of D212 from Sulfolobus spindle-shaped virus Ragged Hills. The overall fold and conservation of active site residues place D212 in the PD-(D/E)XK nuclease superfamily. The greatest structural similarity is found to the archaeal Holliday junction cleavage enzymes, strongly suggesting a role in DNA replication, repair, or recombination. Other roles associated with nuclease activity are also considered.
Viruses are the most abundant biological units on earth (3). However, while more than 5,000 different viruses infecting bacterial and eukaryotic hosts have been described in detail, only about 50 viruses have been described from the archaeal domain of life (29, 40, 47). Interestingly, these archaeal viruses, particularly those infecting the Crenarchaea, exhibit an array of unusual morphologies (47) and unique genomic content. This has led to recognition of seven new viral families, with two additional viral families awaiting assignment (28, 48, 58, 59). However, functional annotations for most gene products in these crenarchaeal viruses have not been made, and we possess only a rudimentary understanding of their fundamental viral processes, including mechanisms of attachment, uptake, transcriptional regulation, genome replication, and virus assembly and release (29, 40, 48).
Sulfolobus spindle-shaped viruses (SSVs), or Fuselloviridae, are among the best-studied crenarchaeal viruses. Members of this family include Sulfolobus spindle-shaped virus 1 (SSV1) (32) and Sulfolobus spindle-shaped virus Ragged Hills (SSVRH) (58). SSV1 is the type virus for the family, and along with SSVRH, exhibits a characteristic 60- by 90-nm spindle- or lemon-shaped morphology with short tail fibers at one end (43, 58). These viruses contain double-stranded circular DNA genomes that encode 34 and 38 open reading frames (ORFs), respectively (58). Early analyses of the SSV1 genome revealed only two ORFs with similarity to genes of known function. ORF D335 encodes a viral integrase, although this gene appears to be nonessential (2, 11, 36), while B251 exhibits limited similarity to DnaA (23). More recently, improved bioinformatic methods have noted additional similarities (17, 48). However, 26 of 34 ORFs from SSV1 are still not reliably identified by bioinformatics.
An early study by Reiter et al. identified three proteins in the SSV1 viral particle (51). VP1 and VP3 are structural proteins (51), while VP2 is thought to function as a small, packaged, DNA binding protein (52, 58). More recently, two additional components that copurify with the viral particle have also been identified, C792 and D244 (33). C792 is a predicted membrane protein and thus likely to serve a structural role, while D244 is a soluble protein of unknown function (33). Structural studies have also contributed to our understanding of the SSV1 proteome, with crystal structures of D63, F112, and F93 revealing distant evolutionary relationships and potential functions for these proteins (26, 27, 29, 33). Unfortunately, attempts to crystallize D244 for structural annotation have been unproductive, despite significant efforts (B. J. Eilers and C. M. Lawrence, unpublished data). However, D244 orthologs are present in many Fuselloviridae (58), including SSVRH D212, which shows 80% sequence identity to SSV1 D244. Here we present the initial biochemical and structural analysis of SSVRH D212 and discuss the functional implications.
ORF D212 was amplified using nested PCR, and SSVRH DNA was purified as described previously (52). Primers introduced a Shine-Dalgarno sequence, an N-terminal His6 tag, and attB sites. The internal forward and reverse primers were 5′-ATGCATCACCATCACCATCACATGACCGAAACTGATTTTAA-3′ and 5′-GTACAAGAAAGCTGGGTCCTACTTATTACTCCTAGCTAAAT-3′, respectively, while the external forward and reverse primers have been described previously (33). The amplified product was inserted into pDONR201, its sequence was verified, and then the product was transferred into pDEST14 (Invitrogen), yielding pEXP14-D212.
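As a quick check on the primer design, the internal forward primer can be translated in frame from its first base to confirm that it encodes a start codon, the His6 tag, and the beginning of the D212 coding sequence. The following is a minimal sketch, not part of the original methods; the partial codon table and the helper name `translate` are illustrative assumptions.

```python
# Partial standard genetic code: only the codons present in the primer.
# This table and the helper below are illustrative, not from the paper.
CODON_TABLE = {
    "ATG": "M", "CAT": "H", "CAC": "H", "ACC": "T",
    "GAA": "E", "ACT": "T", "GAT": "D", "TTT": "F",
}

# Internal forward primer as given in the text (5' to 3').
INTERNAL_FORWARD_PRIMER = "ATGCATCACCATCACCATCACATGACCGAAACTGATTTTAA"


def translate(dna: str) -> str:
    """Translate complete codons from the first base; drop a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


peptide = translate(INTERNAL_FORWARD_PRIMER)
print(peptide)  # MHHHHHHMTETDF
```

The translation reads Met, six His residues, and then a second Met followed by Thr-Glu-Thr-Asp-Phe, which is consistent with an N-terminal His6 tag placed directly ahead of the D212 start codon.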
D212 was expressed in Escherichia coli as described for other fuselloviral proteins (26, 29, 33). Purification of D212 also followed this earlier protocol, which included cell lysis, heat denaturation, immobilized metal affinity chromatography (IMAC), and size exclusion chromatography. Lysis buffer was composed of 20 mM Tris, pH 8.0, 20 mM NaH2PO4, 1 M NaCl, 5 mM imidazole, and 0.1 mM phenylmethylsulfonyl fluoride. IMAC elution buffer was 20 mM Tris (pH 8.0), 300 mM NaCl, and 200 mM imidazole. Buffer for size exclusion chromatography was 10 mM Bis-Tris (pH 6.5) and 100 mM NaCl. Protein concentrations were determined by Bradford assay with bovine serum albumin (BSA) as a standard (33).
D212 was concentrated to 10 mg/ml using Amicon spin concentrators and crystallized using hanging drop vapor diffusion. Drops were set up at 24°C using 2 μl of D212 and 2 μl of well solution consisting of 0.1 M HEPES (pH 7.5), 0.05 M K2HPO4, and 10% (wt/vol) polyethylene glycol (PEG) 8000. Crystals up to 0.1 × 0.05 × 0.02 mm in size were obtained in 2 weeks. The crystals were transferred to well solution containing 25% glycerol as a cryoprotectant for 10 min and then flash-frozen in liquid nitrogen. Heavy atom soaks were carried out for 1 h in 10 mM K2PtCl4 or 30 s in 0.5 M NaBr in phosphate-free well solution containing 25% glycerol. A single-wavelength data set centered on the Br K edge and a three-wavelength data set at the Pt L edge were collected at the Stanford Synchrotron Radiation Laboratory. A native data set was also collected. Data were integrated, scaled, and reduced in space group P21 to a resolution of 2.4 Å using HKL-2000 (41) (Table 1). For the Mn2+ metal soaks, native D212 crystals were transferred to phosphate-free well solution augmented with 25% glycerol and 10 mM MnCl2. The crystals were soaked overnight at 24°C and flash-frozen. Data were collected at 100 K using our Cu-Kα home source and a MAR345 image plate detector.
The structure was solved using multiple isomorphous replacement with anomalous scattering (MIRAS). SOLVE (55) was used to determine the positions of two Pt atoms and one Br atom per asymmetric unit and to calculate initial phases. RESOLVE (54) was used for density modification and initial model building. Cycles of iterative model building with COOT (15) and refinement with REFMAC5 (35) led to the final model. Translation/libration/screw (TLS) parameters (42) were included in the refinement with each D212 subunit divided into 15 TLS groups. The final model, which includes 75 water molecules, results in crystallographic (Rcryst) and free (Rfree) R factors of 20.9% and 24.5%, respectively (Table 2). Model quality was assessed with MolProbity (31), with 98.1% of the residues in the most favored regions and the remaining 1.9% in additionally allowed regions of the Ramachandran plot. Figures were generated with PYMOL (14) and TOPDRAW (7).
Fixed and mobile four-way junctions were assembled and purified as described in the work of Birkenbihl et al. (6). Holliday junction cleavage activity was assayed as described for the Sulfolobus islandicus rod-shaped virus 1 (SIRV1) and SIRV2 Holliday junction cleavage enzymes with only minor modifications (6). Specifically, the 5′-32P-end-labeled fixed and mobile Holliday junctions (10 nM) were incubated at 25°C for 2 h in a 10-μl reaction mixture that also contained 1 μM D212, 10 mM Bis-Tris (pH 6.5), 100 mM NaCl, 1 mM dithiothreitol (DTT), and 10 mM divalent cation, either MgCl2 or MnCl2. T4 endonuclease VII was used as a positive control.
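The stoichiometry of this assay follows directly from the stated concentrations and reaction volume. The short calculation below is an illustrative sketch only. It assumes that the 1 μM D212 concentration refers to protomers, which the text does not specify, and all variable names are hypothetical.

```python
# Values taken from the assay description in the text.
REACTION_VOLUME_UL = 10   # reaction volume in microliters
JUNCTION_NM = 10          # labeled Holliday junction, 10 nM
D212_NM = 1000            # D212, 1 uM expressed in nM


def amount_fmol(conc_nm: int, volume_ul: int) -> int:
    """nM x ul = (1e-9 mol/l) x (1e-6 l) = 1e-15 mol, i.e. femtomoles."""
    return conc_nm * volume_ul


junction_fmol = amount_fmol(JUNCTION_NM, REACTION_VOLUME_UL)
d212_fmol = amount_fmol(D212_NM, REACTION_VOLUME_UL)

# Molar excess of enzyme over substrate.
protomer_excess = D212_NM / JUNCTION_NM
# Assumption: 1 uM is the protomer concentration, so dimers are half as abundant.
dimer_excess = protomer_excess / 2

print(f"junction: {junction_fmol} fmol, D212: {d212_fmol / 1000:.0f} pmol")
print(f"protomer excess: {protomer_excess:.0f}x, dimer excess: {dimer_excess:.0f}x")
```

Under these assumptions, each reaction contains 100 fmol of junction and 10 pmol of D212, a 100-fold molar excess of protomer (50-fold if counted as dimers) over substrate.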
Coordinates and structure factors were deposited in the Protein Data Bank (PDB ID 2W8M).
The D212 construct consists of 212 residues of the full-length protein plus an N-terminal His6 tag, giving a calculated mass of 25,895 Da. It elutes from the size exclusion column with an apparent molecular mass of 50 kDa, suggesting that D212 is present in solution as a homodimer. D212 crystallizes in space group P21 with two copies of the D212 protomer in the asymmetric unit. Because D212 lacks methionine residues, the structure was determined at 2.4-Å resolution by multiple isomorphous replacement with anomalous scattering (MIRAS). Consistent with size exclusion chromatography, the structure reveals a homodimer.
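The inference of a homodimer from size exclusion chromatography amounts to dividing the apparent mass by the calculated protomer mass and rounding to the nearest whole number of subunits. The snippet below is a minimal sketch of that arithmetic using the two values reported in the text; the function name is an illustrative assumption, and apparent masses from size exclusion depend on molecular shape, so the result is only an estimate.

```python
CALCULATED_PROTOMER_MASS_DA = 25_895  # His6-tagged D212, from the text
APPARENT_MASS_DA = 50_000             # size exclusion estimate, from the text


def estimate_subunit_count(apparent_da: float, protomer_da: float) -> tuple[float, int]:
    """Return the raw mass ratio and the nearest whole number of subunits."""
    ratio = apparent_da / protomer_da
    return ratio, round(ratio)


ratio, subunits = estimate_subunit_count(APPARENT_MASS_DA, CALCULATED_PROTOMER_MASS_DA)
print(f"mass ratio = {ratio:.2f}, nearest oligomeric state = {subunits}")
```

The ratio of about 1.93 rounds to two subunits, which agrees with the two protomers found in the asymmetric unit.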
The D212 polypeptide folds to form a mixed α/β structure composed of five α-helices and eight β-strands (Fig. 1) that can be divided into two subdomains which we refer to as the “catalytic domain” and the “DNA binding domain,” respectively (see below and reference 45). The catalytic domain is the central domain of the dimeric protein and is a doubly wound α/β structure. It is composed of a predominantly antiparallel five-stranded β-sheet (β1, β2, β3, β5, and β6) that is flanked on one side by helices α1 and α6 and on the other by helix α2. The second, smaller domain, the DNA binding domain (45), is composed of a 3-stranded antiparallel β-sheet (β4, β7, and β8) with helix α3 lying along one side and helices α4 and α5 running across one face. In addition, a number of the connecting loops are disordered (shown as dotted lines in Fig. 1). In chain B, disordered regions include residues 1 to 18 at the N terminus, residues 58 to 67 in the β1-α2 loop, residues 100 to 106 connecting β3 to α3, residues 172 to 174 in the β7-β8 loop, and residues 204 to 212 at the C terminus (Fig. 1). Although exact residue numbers differ, similar disorder is seen for chain A. The two subunits show similar topology over most of the structure and superimpose with a root mean square deviation (RMSD) of 0.62 Å. The only notable difference between the two subunits is the absence or presence of the short α3 helix, which is disordered in chain A. The dimer interface is formed by several N-terminal residues, helix α1, and strand β1. The α1 helix lies at the heart of the interface, where it interacts with the symmetry-related helix in an antiparallel side-by-side fashion. Together, these elements form a hydrophobic interface that buries ~1,090 Å2 of the subunit surface.
Consistent with its calculated pI of 9.0, a search for structurally similar proteins using the DALI (20) server identified members of the PD-(D/E)XK nuclease superfamily as the closest structural homologues. This diverse endonuclease family was initially identified in structurally characterized type II restriction enzymes but has subsequently been recognized in many other enzymes involved in the “3 Rs” of DNA replication, repair, and recombination (38). Consistent with the lack of functional annotation for D212 and its orthologs, members of the PD-(D/E)XK nuclease superfamily can exhibit extreme variability in their amino acid sequences, making it difficult to recognize highly diverged members (38). Among this family, D212 was found to be most similar to the Holliday junction cleavage enzyme from Sulfolobus solfataricus (Sso Hjc, PDB ID 1HH1), giving a DALI Z-score of 9.0, with an RMSD of 3.0 Å for 115 equivalent residues showing 12% identity (8). Other proteins with significant similarity include additional archaeal Holliday junction cleavage enzymes, an atypical homing endonuclease (I-SspI), and several type II restriction endonucleases. I-SspI, the atypical bacterial homing endonuclease (PDB ID 2OST), gives a DALI Z-score of 8.5, with an RMSD of 3.2 Å for 117 equivalent residues showing 12% identity (61). Among the restriction endonucleases, HincII (PDB 2GIG) gives a DALI Z-score of 5.0 and an RMSD of 3.4 Å for 116 equivalent residues with 9% identity (21). Importantly, the structures of I-SspI and HincII have been determined in complex with DNA.
Comparison of the S. solfataricus Holliday junction cleavage enzyme (Sso Hjc) with D212 highlights the similarity between the two folds (Fig. 2). Despite the additional length, 212 residues compared to ~145 in the archaeal Holliday junction cleavage enzymes, D212 largely lacks additional secondary structural elements. A minor difference is replacement of helix α3 in D212 with strand β4 in Sso Hjc (Fig. 2). Other than this, the increased size of D212 is largely a result of increased loop lengths in regions that connect secondary structural elements and an additional 10 residues at the N terminus. Interestingly, the N-terminal residues are disordered in both D212 (18 disordered residues) and Sso Hjc (8 disordered residues), where these residues have been implicated in DNA binding and are thought to be ordered only upon interaction with DNA (8). Similarly, I-SspI (atypical homing endonuclease) with 151 amino acids is also largely similar, although, consistent with the lower Z-score, it exhibits additional differences (Fig. 2). These are particularly noticeable in the DNA recognition domain (61), where I-SspI inserts two additional β-strands and loses helix α4.
In contrast to Sso Hjc and I-SspI, HincII (type II restriction endonuclease) with 257 amino acids is significantly larger than D212. Nevertheless, the secondary structural elements in both domains of D212 match closely to the core nuclease structure in this restriction enzyme (21), and the major differences between the structures are the presence of several embellishments in the HincII structure. These include the addition of two α-helices and a β-strand at the N-terminal end, a small saposin-like domain that is inserted following the first β-strand, and two α-helices at the C-terminal end (21). In addition, the last helix in D212 adopts a different orientation from that seen in HincII.
The archaeal Holliday junction cleavage enzymes, homing endonuclease (I-SspI), and type II restriction enzymes that show structural similarity to D212 are all members of a subclass of the PD-(D/E)XK nuclease family that contain an extended motif with an extra N-terminal acidic residue (E), yielding an E-PD-(D/E)XK active site motif that contains a trio of acidic residues. However, many of the individual residues in this motif are not strictly conserved, particularly the proline and lysine residues. Superposition of Sso Hjc and HincII on D212 identifies an equivalent active site motif in D212 (Fig. 3 and 4). Specifically, Glu28, Asp79, and Asp97 in D212 correspond to residues Glu12, Asp42, and Glu55 in Sso Hjc, respectively, while in HincII, these residues correspond to Glu35, Asp114, and Asp127. Importantly, these residues are also strictly conserved in all of the D212 fuselloviral orthologs, including SSV1 D244. In contrast, I-SspI makes several substitutions, utilizing Asp11, Asp36, and Gln49. These clusters of acidic active site residues generally function to coordinate a metal ion, either Mg2+ or Mn2+, that is thought to play multiple roles in catalysis. These include facilitating the interaction with substrate DNA, activating water for nucleophilic attack, and stabilizing the transition state (1, 10, 25, 39, 61).
Similarly, the lysine residue in the E-PD-(D/E)XK motif can also serve in substrate recognition, to orient and stabilize the attacking hydroxyl nucleophile or to play a role in transition state stabilization (25, 61). While structure-based sequence alignments do not show a lysine residue in the equivalent position in D212, Lys123 is in close proximity to the three acidic active site residues and might serve a similar role. This would be consistent with reports on the migration of active site lysine residues in type II restriction endonucleases (9, 16, 46, 53). Interestingly, HincII also shows a lysine at this position, which is also strictly conserved in the fuselloviral orthologs (Fig. 3). Finally, the proline residue of the motif is present in Sso Hjc but is absent from I-SspI, HincII, D212, and all D212 orthologs.
The structural similarity to Sso Hjc and HincII suggested that Glu28 and Asp79 of D212 would also serve to coordinate Mg2+ or Mn2+. This was tested by soaking D212 crystals in 10 mM MnCl2 and collecting a 3.7-Å data set on our home source diffractometer. Indeed, the difference density map for the Mn2+ soaked crystals (Fig. 4) revealed a single manganese ion per subunit at the predicted metal binding site, further confirming the role of these residues in D212. Importantly, the two active sites in the D212 dimer are separated by about 25 Å (Fig. 4B). While this separation is similar to that seen in Holliday junction cleavage enzymes, this is significantly greater than that seen in type II restriction endonucleases and the homing endonucleases (I-SspI), where two active sites come together to cleave opposing strands in a single piece of double-stranded DNA.
The structure of an archaeal Holliday junction cleavage enzyme in complex with DNA has not been reported. Structures for two bacteriophage enzyme/DNA complexes, T7 endonuclease I (18) and T4 endonuclease VII (5), have been reported. Unfortunately, their lack of structural similarity to D212 makes it difficult to model a complex with D212. However, structures have been reported for I-SspI and HincII in complex with a single piece of double-stranded DNA, and these complexes can be docked on D212 by structural superposition. The fit of the HincII DNA appears to be a superior model, although it does clash with elements of the β3/β4 connection that are distal to the active site, which in the B chain includes helix α3. Nevertheless, the docking suggests that DNA is bound within the rather prominent clefts on the “sides” of the D212 dimer (Fig. 4) and is mediated by residues at the N terminus of helix α1, the α2/β2 loop, residues lying within the extended connection between β3 and β4, the N terminus of α4, β4, and the β7-β8 loop. Many of these elements lie in disordered or poorly ordered areas and are likely to become better ordered upon interaction with DNA. With regard to the clash of the β3-β4 connection, this mobile loop might rearrange to avoid clashes with the DNA. Alternatively, it might serve to discriminate between double-stranded DNA and a junctional DNA complex. This model predicts that the disordered β7-β8 loop will lie within the DNA major groove. A similar loop in HincII, known as the recognition or R loop, participates in base-specific interactions, suggesting a similar role for the β7-β8 loop in D212 (30). The surface electrostatics in the presence of Mn2+ are consistent with this binding model (not shown). In addition, we also note a very basic surface feature running across the “back side” of the dimer, between the two active sites, which might further facilitate substrate recognition.
The locations of the subunit interfaces in Hjc, HincII, and I-SspI differ substantially from those in D212. HincII recognizes a degenerate 8-bp sequence to make a blunt-ended cut. Thus, the dimer interface is positioned such that two active sites are brought to bear on opposite sides of a single piece of double-stranded DNA. Similarly, I-SspI utilizes two symmetrically arranged active sites to cleave a 23-bp target sequence, yielding 3-base, 3′ overhangs. In contrast, the active sites in D212 and Hjc are much further apart, ~25 Å. In the case of D212, this allows double-stranded DNA to be docked into each of the active sites (Fig. 4), perhaps representing two arms of a junctional complex. These differences suggest that D212 is more likely to catalyze junctional cleavage during DNA replication, repair, or recombination than to act as a restriction enzyme or homing endonuclease.
Due to the similarity with Sso Hjc, we tried to demonstrate D212 Holliday junction cleavage (Hjc) activity with both fixed and mobile Holliday junctions (6). However, using freshly purified D212 and an array of divalent cations, Hjc activity was not observed. In contrast, Hjc activity was observed for a positive control, T4 endonuclease VII. However, the potential for nuclease activity is clearly present, as older protein preparations that were proteolytically clipped did show metal-dependent nuclease activity (data not shown). But because this activity required excessively high protein concentrations, the activity of the clipped protein was not characterized further. Matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) analysis of the crystals used to collect the native data set indicated that the crystal structure represents the intact protein rather than this degraded form.
The structure of D212 clearly reveals a type II restriction endonuclease fold and identifies the hallmark PD-(D/E)XK catalytic motif that is spatially conserved in this nuclease superfamily. This fold and its associated catalytic residues are also found in other classes of nucleases, where they play roles in DNA replication, repair, recombination, and other processes requiring nuclease activity (10). Structural examples include MutH (4); Vsr (56); RecB and λ-exonuclease (24), which are involved in DNA repair; archaeal Holliday junction cleavage enzymes like Hjc and Hje (8, 34, 37); TnsA, which is a component of Tn7 transposase (19); and the homing endonuclease I-SspI (61). In addition, bioinformatics, biochemical, and/or genetic approaches have identified the PD-(D/E)XK motif in non-long terminal repeat (LTR) retrotransposases (60) and in proteins associated with excisionase activity (10, 49, 57).
The similarity of D212 to the archaeal Holliday junction cleavage enzymes suggests that it may function as a junctional resolvase. However, the characterized archaeal enzymes position their two active site centers on a relatively flat surface (8, 34, 37) and are predicted to recognize a planar, stacked-X Holliday junction (22, 34) like that seen in the structure of T4 endonuclease VII (5). In contrast, the predicted DNA binding channels in D212 run at oblique angles down opposite sides of the dimer. This is reminiscent of the structure of the T7 endonuclease I and suggests that D212 recognizes a junctional complex with substantially different geometry than that recognized by previously characterized archaeal enzymes, perhaps explaining the lack of activity in the Holliday junction cleavage assay. Like T7 endonuclease I, D212 may show strong selectivity for a DNA branch point with a specific geometric structure (18) and might exhibit significant discrimination for particular sequences within the junctional arms (13).
This putative activity might be employed to resolve intermediates in replication of the fuselloviral genome. Alternatively, D212 may function in a DNA repair pathway, and expression of D212 might play a role in correcting replication defects or other damage to the viral genome. In this light, it is interesting that SSV1 D244, a D212 ortholog, is thought to be packaged within the SSV1 viral particle (33). The presence of D212 in the viral particle could ensure availability upon infection of a new host cell, helping to protect the integrity of the viral genome from the acidic, hyperthermal environments in which these viruses reside.
Another possible function is suggested by the presence of the PD-(D/E)XK motif in ExiS from the Mycoplasma MAV1 phage and XhisH from Anabaena. ExiS is thought to participate in excision of the MAV1 genome, while XhisH is implicated in excision of the fdxN element from the Anabaena chromosome (10, 49, 57). SSVRH, SSV1, and other Fuselloviridae utilize a virally encoded integrase to integrate their viral DNA. However, the attP sites utilized for this reside within the integrase gene, with integration resulting in disruption of the integrase gene. As long as episomal copies of the viral genome remain, they can provide the necessary integrase needed to excise the viral genome. However, in their absence, it has not been clear if, or how, the viral genome might be rescued. This work raises the possibility that D212 might play a role in excising the viral genome. Finally, D212 might facilitate interviral recombination events between fuselloviruses, which are thought to be relatively frequent, or even participate in the excision of a covalently closed circular DNA virus from a concatemer of two integrated fuselloviruses, as suggested by Redder et al. (50).
This work was supported by the National Science Foundation (MCB-0628732 and MCB-0920312). Portions were carried out at the Stanford Synchrotron Radiation Laboratory with support from the Department of Energy and the National Institutes of Health.
Published ahead of print on 7 April 2010.