Yellow fever virus (YFV), a member of the Flavivirus genus, has a plus-sense RNA genome encoding a single polyprotein. Viral protein NS3 includes a protease and a helicase that are essential to virus replication and to RNA capping. The 1.8-Å crystal structure of the helicase region of the YFV NS3 protein includes residues 187 to 623. Two familiar helicase domains bind nucleotide in a triphosphate pocket without base recognition, providing a site for nonspecific hydrolysis of nucleoside triphosphates and RNA triphosphate. The third, C-terminal domain has a unique structure and is proposed to function in RNA and protein recognition. The organization of the three domains indicates that cleavage of the viral polyprotein NS3-NS4A junction occurs in trans.
Flaviviruses are arthropod-borne pathogens that cause a number of serious human diseases throughout the world. Despite their importance as human pathogens, the molecular mechanisms of flavivirus replication are minimally understood, hindering the development of effective antiviral therapies and vaccines. Yellow fever virus (YFV) is the prototype member of the Flavivirus genus. Like many other flaviviruses, YFV is transmitted by mosquitoes. Symptoms of YFV infection include kidney failure, internal bleeding, high fever, and hepatitis, which leads to the yellow coloring of the skin for which the disease is named. Although vaccination using a live attenuated strain has been successful for decades, yellow fever still causes over 30,000 deaths per year, mostly in Africa and South America.
In addition to YFV, the Flavivirus genus also includes West Nile virus, four serotypes of dengue virus, and Japanese encephalitis virus. Flavivirus is the largest genus in the Flaviviridae family, whose other genera are Hepacivirus and Pestivirus. The only characterized hepacivirus is hepatitis C virus (HCV), which is well studied due to its importance in chronic liver disease and in liver failure leading to transplant. Although many molecular details differ between HCV and the flaviviruses and their protein sequences are less than 20% identical, there are sufficient similarities in replication strategy to allow comparisons between the viral proteins.
The flavivirus genome is a ≈11-kb plus-sense RNA containing a 5′ cap (m7G5′ppp5′A) but lacking a 3′ poly(A) tail (16, 38). The genome encodes a 370-kDa polyprotein precursor, which is inserted into the membrane of the endoplasmic reticulum and processed to yield three structural proteins (C, M, and E) and seven replication proteins (NS1, NS2A, NS2B, NS3, NS4A, NS4B, and NS5) (16). Host proteases process the polyprotein at sites in the endoplasmic reticulum lumen and a viral protease cleaves at specific sites on the cytoplasmic side of the endoplasmic reticulum membrane. The viral serine protease is formed by the N-terminal ≈175 residues of the NS3 protein and a cofactor peptide within the NS2B protein (2, 15, 16, 34, 54).
The C-terminal 440 amino acids of the NS3 protein constitute a helicase region, based on sequence analysis (30). The precise biological functions of the NS3 helicase region are unknown, but it is thought to separate RNA daughter and template strands and to assist replication initiation by unwinding RNA secondary structure in the 3′ nontranslated region (NTR) (12, 42, 53).
Helicases catalyze a large number of nucleoside triphosphate (NTP)-dependent nucleic acid strand separation and remodeling reactions. They are classified into three superfamilies and one family based on several conserved motifs (19). The flavivirus helicase is a member of the DEAH/D box family within helicase superfamily 2. Seven conserved sequence motifs in superfamily 2 helicases, beginning near position 200 in flavivirus NS3, are associated with NTP hydrolysis and nucleic acid binding (11, 40, 50). Motifs I and II, also known as Walker A and Walker B, respectively (51), exist in all helicase superfamilies. Within the DEAH/D-box family of superfamily 2, the DEAH and DEAD subgroups are defined according to the sequence of the Walker B motif, which binds divalent cations. According to these rules of classification, flavivirus NS3 is a member of the DEAH subgroup.
Crystal structures have been reported for several superfamily 2 helicases, including the HCV helicase (13, 26, 55, 56), bacterial RecQ (7), bacteriophage T4 UvsW (39), yeast eIF4A (10), and bacterial UvrB (47). The helicase motifs form an NTPase active site at the interface of two domains that are common to all helicases. Many helicases also have a third domain. A cleft between the third domain and the two common domains is the binding site for single-stranded nucleic acid. A single strand of DNA has the same position and orientation with respect to the common domains in the structures of helicase-nucleic acid complexes (26, 49), suggestive of a common mechanism for coupling NTP hydrolysis and strand unwinding. Among many proposed mechanisms, an inchworm model, in which the single strand is translocated through the cleft by NTPase-associated hinging of the common domains (41, 49, 57), seems most plausible. This assessment is based on the structure of a helicase-DNA complex that includes regions of both duplex and single-stranded DNA (49).
Among helicases of known structure, HCV helicase is most closely related to Flavivirus helicase. However, the YFV and HCV helicase sequences are only 17% identical overall and have especially weak similarity in the C-terminal region corresponding to the HCV helicase third domain. In contrast, helicases from different members of the Flavivirus genus are much more closely related, with 40 to 90% identical sequences.
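The percent identities quoted above depend on how aligned positions are counted. As a minimal sketch, assuming a pairwise alignment is already in hand and that identity is taken over columns where neither sequence has a gap (the denominator used for the reported values is not stated here), the calculation could look like the following. The function name `percent_identity` and the toy fragments are hypothetical and are not taken from the YFV or HCV sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy aligned fragments (hypothetical, for illustration only).
toy_a = "GAGKT-RKVL"
toy_b = "GSGKSTKV-L"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identical over ungapped columns")
```

Other conventions, such as dividing by the length of the shorter sequence, give different numbers for the same alignment, so values from different sources are comparable only when the convention matches.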
NTPase activity was demonstrated for flavivirus NS3 and for constructs of the helicase region (8, 31, 32, 53). Recently, a conserved Q motif upstream of the Walker A motif was reported to be essential for the ATPase activity of DEAD-box helicases (43). Flavivirus NS3 has no conserved Q motif, although a glutamine residue upstream of Walker A was reported to be essential for ATPase activity of the Powassan flavivirus (18).
Little is known about the unwinding mechanism of flavivirus helicases, including the specificity for DNA and/or RNA duplex and a requirement for a 3′ or 5′ overhang. In most cases, only low levels of helicase activity have been reported for recombinant flavivirus helicases (31, 32), in contrast to Flaviviridae helicases from HCV and a pestivirus (25, 31, 32, 52). Association of the flavivirus helicase with other replicase proteins may influence activity. For example, the flavivirus NS3 helicase domain associates with NS5, which contains the RNA-dependent RNA polymerase (12, 24), and an NS5 requirement for helicase activity has been described (22, 31). However, high levels of helicase activity in the absence of NS5 or other replicase proteins were reported recently for a recombinant dengue virus helicase (6).
We report the cloning, expression, and purification of ATPase-active recombinant NS3 helicases from YFV and West Nile virus, the 1.8-Å crystal structure of the YFV NS3 helicase, and the 2.5-Å structure of its complex with ADP. The structure differs substantially from that of HCV helicase in some regions and provides a basis for further study of the molecular mechanisms of flavivirus replication and for rational development of antiflaviviral compounds.
cDNA helicase constructs from YFV and West Nile virus were amplified, respectively, from pYF23 derived from pACNR/FLYF, kindly provided by Charles Rice (9), and from pWN-CG, kindly provided by Richard Kinney. Constructs YFH and WNH each express 437 amino acids encoded by the flavivirus NS3 gene. Four other constructs express proteins that are longer at the N terminus: YF17 (454 amino acids), WN16 (453 amino acids), YF22 (459 amino acids), and WN21 (458 amino acids). cDNA constructs were inserted into plasmid pET28a (Novagen) encoding a 20-residue, thrombin-cleavable, N-terminal hexahistidine (His6) tag to create expression plasmids pYFH-JW, etc. Plasmids were transformed into Escherichia coli expression strain Rosetta(DE3) (Novagen). Bacterial cultures were grown in Luria broth (LB) supplemented with 50 mg/liter kanamycin and 35 mg/liter chloramphenicol at 37°C to an optical density at 600 nm of 0.7. Expression was induced with 0.4 mM isopropyl-β-thiogalactopyranoside (IPTG). Cultures were grown for an additional 5 h at 20°C. Cells were harvested by centrifugation at 5,000 × g at 4°C and pellets were stored at −20°C.
Recombinant helicases were purified by metal affinity chromatography. All purification steps were carried out at 4°C. Frozen cell pellets from a 1-liter culture were resuspended in 40 ml binding buffer (20 mM imidazole, 500 mM NaCl, 20 mM Tris, pH 7.9), and lysed by sonication (pulse: 10 s on, 15 s off; total, 10 min). The cell extract was centrifuged at 12,000 × g at 4°C for 30 min. The supernatant was filtered (0.45 μm; Millipore), loaded onto a HiTrap chelating column (Promega), and washed with binding buffer. Recombinant helicases were eluted with a 20 to 500 mM imidazole gradient in 500 mM NaCl, 20 mM Tris, pH 7.9. Recombinant helicases eluted at ≈150 mM imidazole, were dialyzed against 500 mM NaCl, 20 mM Tris, pH 7.9, first with and then without 2 mM EDTA, concentrated to 10 mg/ml using Centriprep-30 (Millipore), and stored at −20°C. The solubility of all recombinant helicase variants was sensitive to both ionic strength and temperature. Low ionic strength (less than 300 mM NaCl) or high temperature (≥25°C) dramatically decreased protein solubility.
The His6 tags were removed from constructs YFH and WNH by thrombin cleavage in a reaction mix containing 1 mg/ml helicase, 500 mM NaCl, 25 mM CaCl2, 20 mM Tris, pH 8.4, and 1.5 U/ml recombinant thrombin (Novagen). The reaction mix was incubated at 4°C for 4 days, filtered (0.2 μm; Millipore), and loaded onto a HiTrap chelating column. Untagged helicases were recovered with 500 mM NaCl, 20 mM Tris, pH 7.9, 50 mM imidazole. Purified, untagged helicases (YF-nh and WN-nh) were dialyzed in 500 mM NaCl, 20 mM Tris, pH 7.9, first with and then without 2 mM EDTA and concentrated to 10 mg/ml.
Plasmids pRosetta and pYFH-JW were extracted from the E. coli expression strain and cotransformed into E. coli strain B834(DE3), a methionine auxotroph. Cells were grown as for wild-type YFV helicase, substituting methionine assay medium (BD) with 25 mg/liter l-selenomethionine (SeMet; SIGMA) for LB. Purification and thrombin cleavage were done as described for the wild-type proteins.
ATPase activity was assayed by adding 1 μl protein (0.1 mg/ml) to a reaction mix containing 50 mM Tris, pH 8.0, 10 mM NaCl, 2.5 mM MgCl2, and 1 μCi [α-32P]ATP (3,000 Ci/mmol; Amersham), for a final volume of 10 μl, and incubating at room temperature for 10 min. The reaction was quenched by addition of 20 mM EDTA. A 0.5-μl aliquot of the reaction mix was spotted onto a polyethyleneimine-cellulose sheet (Selecto Scientific). The sheet was developed by ascending thin-layer chromatography in 0.375 M potassium phosphate, pH 3.5, for 45 min, air-dried, and exposed to X-ray film (Kodak) for 15 min.
RNA helicase activity was assayed by 1-hour incubation at 30°C of a 10-μl reaction mixture containing 2 mM ATP, 1 mM MgCl2, 10 fmol RNA substrate in 25 mM HEPES, pH 8.0, 2 mM dithiothreitol, and 5% glycerol. The RNA substrate was the annealed product of 30-nucleotide (5′-CACCUCUCUAGAGUCGACCUGCAGGCAUCG) and 16-nucleotide (5′-CGACUCUAGAGAGGUG) oligomers (Integrated DNA Technology) designed to produce a 16-base-pair duplex with a 14-nucleotide 3′ overhang in the longer strand, which was 5′-end labeled with [γ-32P]ATP. The reaction was initiated by addition of 200 ng recombinant YFV protein or an equivalent volume of buffer and was quenched by addition of EDTA to a final concentration of 10 mM and of sodium dodecyl sulfate to a final concentration of 0.1%. Reaction products were separated by electrophoresis on a 12% acrylamide native gel and detected using a Typhoon scanner (Molecular Devices).
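The stated geometry of the RNA substrate can be checked directly from the two oligomer sequences. The sketch below, with hypothetical helper names `reverse_complement` and `describe_duplex`, assumes simple Watson-Crick pairing and locates the site on the 30-nucleotide strand that is complementary to the 16-nucleotide strand.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

LONG_STRAND = "CACCUCUCUAGAGUCGACCUGCAGGCAUCG"   # 30-nucleotide oligomer, 5' to 3'
SHORT_STRAND = "CGACUCUAGAGAGGUG"                # 16-nucleotide oligomer, 5' to 3'


def reverse_complement(rna: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def describe_duplex(long_strand: str, short_strand: str) -> dict:
    """Locate the annealing site and report overhangs on the long strand."""
    target = reverse_complement(short_strand)
    start = long_strand.find(target)
    if start < 0:
        raise ValueError("short strand is not fully complementary to the long strand")
    return {
        "duplex_bp": len(short_strand),
        "five_prime_overhang": start,
        "three_prime_overhang": len(long_strand) - start - len(short_strand),
    }


substrate = describe_duplex(LONG_STRAND, SHORT_STRAND)
print(substrate)
```

The short strand pairs with the first 16 nucleotides of the long strand, leaving the remaining 14 nucleotides as a single-stranded 3′ tail, in agreement with the design described above.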
Crystals of wild-type and SeMet YFV helicase (YF-nh) were grown by hanging-drop vapor diffusion from a 1:1 mixture of protein (10 mg/ml in 500 mM NaCl, 20 mM Tris, pH 7.9, 10 mM MgCl2, 10 mM ATP or nonhydrolyzable analog AMPPNP [Sigma]), and reservoir solution (6% polyethylene glycol 10000, 100 mM HEPES, pH 7.5). The 8-μl drops were microseeded and equilibrated with reservoir. Crystals grew to an average size of about 200 μm by 50 μm by 50 μm. Crystals were harvested into cryoprotectant solution (reservoir solution with 35% ethylene glycol), soaked for 1 min, and flash-frozen in liquid nitrogen.
All diffraction data were collected at the Advanced Photon Source, Argonne National Laboratory, Argonne, IL, using beamlines as noted in Table 1. Data were processed and scaled with HKL2000 (36). Statistics are summarized in Table 1.
The structure of YFV helicase was solved by SeMet multiwavelength anomalous diffraction (MAD). The 13 methionine residues in 440 amino acids produced a MAD signal of about 5% of F. The primitive monoclinic unit cell (space group P21, a = 55.0 Å, b = 43.8 Å, c = 93.1 Å, β = 104.0°) contains one YFV helicase molecule per asymmetric unit, corresponding to a Matthews coefficient of 2.20 Å³/Da and only 44% (vol/vol) solvent. The program SOLVE (46) was used to locate 12 Se sites, refine the Se partial structure, and compute phase probabilities. After solvent flattening, about 25% of the model was autobuilt into the 2.5-Å electron density using RESOLVE (45). An essentially complete model was built into this map with program O (23).
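The packing density and solvent content follow from the unit cell by simple arithmetic. The sketch below assumes a protein molecular mass of about 49,500 Da, which is an illustrative value for a 440-residue protein and is not stated in the text, and uses the common approximation that the solvent fraction is 1 − 1.23/V_M. Space group P21 has two asymmetric units per unit cell.

```python
import math


def monoclinic_cell_volume(a: float, b: float, c: float, beta_deg: float) -> float:
    """Volume (cubic angstroms) of a monoclinic unit cell."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(cell_volume: float, molecules_per_cell: int, mass_da: float) -> float:
    """Crystal volume per unit of protein mass (cubic angstroms per dalton)."""
    return cell_volume / (molecules_per_cell * mass_da)


def solvent_fraction(v_m: float) -> float:
    """Approximate solvent fraction, assuming a typical protein density."""
    return 1.0 - 1.23 / v_m


ASSUMED_MASS_DA = 49_500.0   # assumption: not given in the text
MOLECULES_PER_CELL = 2       # P21: two asymmetric units, one molecule in each

volume = monoclinic_cell_volume(55.0, 43.8, 93.1, 104.0)
v_m = matthews_coefficient(volume, MOLECULES_PER_CELL, ASSUMED_MASS_DA)
solvent = solvent_fraction(v_m)
print(f"V = {volume:.0f} A^3, V_M = {v_m:.2f} A^3/Da, solvent = {100 * solvent:.0f}%")
```

With these inputs the estimate reproduces the reported values of about 2.2 Å³/Da and 44% solvent. The exact figures shift slightly with the molecular mass used.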
The initial model was refined against the 2.5-Å merged MAD data and then against the 1.8-Å data using the program REFMAC (33). Of three N-terminal residues remaining from the His-tagged construct, Ser185 and His186 are ordered. The natural sequence begins with Met187. In the final model, one loop (residues 397 to 404) is disordered and not included. No residues are in disallowed regions of the Ramachandran plot. Final statistics are summarized in Table 1.
Structure superpositions with HCV helicase were done with O and, for domains 1 and 2, included all core secondary structures (excluding α1), all helicase motifs (excluding motifs I [Walker A] and V), and all loops of the same length and similar conformation. Probes of the structural database were done with DALI (20); electrostatic surface potentials were calculated with APBS (4); sequence alignment was done with CLUSTAL W (48). Coordinates are available from the Protein Data Bank with accession codes 1YKS for the free enzyme and 1YMF for the nucleotide complex.
In order to identify the minimal functional flavivirus helicase and suitable material for crystallization, NS3 helicase proteins were designed with variable N termini and the native C terminus, defined by the viral protease cleavage site between NS3 and NS4A. The protease-helicase linker within NS3 was assumed to encompass the nonconserved region corresponding to residues 171 to 186 in YFV NS3, based on sequence alignment and the crystal structure of a dengue virus protease domain (34). Three recombinant proteins were generated from the YFV helicase region (YFH, YF17, and YF22) and three from the West Nile virus helicase region (WNH, WN16, and WN21) (Fig. 1A).
The shortest proteins have N termini only 14 residues upstream of the Walker A motif, at Met187 in YFH and at Met183 in WNH (numbering based on the sequence of the processed NS3 protein). The midlength proteins begin at the last conserved residue in the protease domain, Gln170 in YF17 and Gln167 in WN16. The longest proteins include the final β-strand of the protease domain and begin at Val165 in YF22 and at Ile162 in WN21. All six proteins were produced in soluble form in an E. coli expression system (Fig. 1B). Analogous constructs of the dengue virus helicase were not obtained in soluble form.
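The construct start positions given here can be cross-checked against the construct lengths given in the methods. In the sketch below, the YFV C terminus (residue 623) is taken from the text. The West Nile virus C terminus is not stated, so it is inferred from the WNH start position and length, which is an assumption. The names are hypothetical.

```python
YFV_C_TERMINUS = 623             # stated: YFV helicase spans residues 187 to 623
WNV_C_TERMINUS = 183 + 437 - 1   # assumption: inferred from WNH (starts at Met183, 437 residues)

# Construct name -> (first NS3 residue, C-terminal residue)
CONSTRUCTS = {
    "YFH": (187, YFV_C_TERMINUS),
    "YF17": (170, YFV_C_TERMINUS),
    "YF22": (165, YFV_C_TERMINUS),
    "WNH": (183, WNV_C_TERMINUS),
    "WN16": (167, WNV_C_TERMINUS),
    "WN21": (162, WNV_C_TERMINUS),
}

# Lengths of the NS3-encoded portion as reported in the methods.
REPORTED_LENGTHS = {"YFH": 437, "YF17": 454, "YF22": 459,
                    "WNH": 437, "WN16": 453, "WN21": 458}


def construct_length(first: int, last: int) -> int:
    """Number of residues in an inclusive residue range."""
    return last - first + 1


computed = {name: construct_length(first, last) for name, (first, last) in CONSTRUCTS.items()}
mismatches = sorted(name for name in computed if computed[name] != REPORTED_LENGTHS[name])
print(computed, "mismatches:", mismatches)
```

The WNH entry is consistent by construction. The other five lengths are independent checks, and all agree with the reported values.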
ATPase activity of the six proteins was confirmed by radioassay. In order to compare their activities, assay conditions were chosen in which the radiolabeled ATP was incompletely hydrolyzed. The six purified proteins have roughly similar activities for ATP hydrolysis (Fig. 1C). The specific activity of the YFH construct was 0.2 μmol ADP/min/mg protein. The YFH and WNH constructs have the shortest N terminus preceding a Walker A motif reported for an ATPase-active helicase. These two proteins also have no glutamine residue upstream of the Walker A motif. Thus, the observation that the ATPase of the Powassan flavivirus requires a Q motif upstream of the Walker A motif (18) does not extend to other flaviviruses. A Q motif in superfamily 2 DEAD-box helicases is essential to ATPase activity and specificity because its conserved glutamine (Q) recognizes the adenine base of ATP (43). Flavivirus helicases, which are DEAH-box helicases, do not have a conserved glutamine upstream of the Walker A motif and are not specific for ATP (53).
Based on the crystal structures of the dengue virus protease region (34) and of the YFV helicase region reported here, the flavivirus NS3 linker region is 11 to 16 residues in length (Glu176-Leu188 in YFV NS3). It is also highly variable in sequence. The ATPase data show that constructs with as few as two linker residues (YFH and WNH) have ATPase activity equivalent to that of longer constructs. In contrast to the ATPase activity, the RNA helicase activity of the three proteins from YFV was dissimilar. The longest construct (YF22) had substantial ATP-dependent activity in separation of RNA strands, whereas the two shorter constructs (YFH and YF17) had no detectable strand-separation activity in the same assay conditions (Fig. 1D). The basis for these differences is unknown, but similar results have been reported for the NS3 helicase region from dengue virus type 2 (31). Recombinant proteins were placed into crystallization trials. Good-quality crystals were obtained from only the shortest construct of the YFV helicase after removal of the polyhistidine tag.
The 1.8-Å crystal structure of YFV NS3 helicase (residues 187 to 623) was solved by multiwavelength anomalous diffraction using selenomethionyl (SeMet) protein (Table 1). With the exception of one nonconserved, disordered loop (residues 397 to 404), the refined model is complete, including the natural NS3 C terminus and two N-terminal residues from the polyhistidine affinity tag (Ser185 and His186). YFV helicase has three domains of roughly equal size (Fig. 2A). Domain 1 (residues 187 to 327) and domain 2 (328 to 488) are similar to the core domains of other superfamily 2 helicases, especially those of HCV helicase. However, domain 3 (residues 489 to 623) is strikingly different in the flavivirus and HCV helicases.
The seven sequence motifs of superfamily 2 helicases (19) occur in domains 1 and 2 and map to interdomain clefts (Fig. 2A). Residues in the helicase motifs are involved in NTP binding and hydrolysis and in coupling the NTPase to RNA duplex unwinding by an unknown mechanism (11, 50).
The Walker A and B motifs (motifs I and II) bind Mg2+-NTP (Fig. 2B). The nucleotide complex of YFV helicase is the first such structure for a helicase from the Flaviviridae family, as none of the HCV helicase crystal structures includes bound nucleotide. The ADP β-phosphate binds the Walker A motif (motif I; G201AGKT205) via hydrogen bonds to the backbone NHs of Gly 201 and Gly 203 and to the side chain of Lys 204. The adenine and ribose groups protrude from the protein, consistent with the lack of nucleotide specificity for the NTPase activity of flavivirus NS3 helicases (53). The protein may have direct contacts with ribose in more closed configurations of domains 1 and 2. The site is also ideally suited to receive the 5′-triphosphate end of an RNA substrate, consistent with evidence that the NTPase active site also catalyzes the RNA triphosphatase reaction (5).
In addition to lack of nucleotide specificity, the site does not engulf the nucleotide in a manner that would prevent binding of the 5′ end of an RNA polymer. The Walker A motif is flexible, varying from well ordered to disordered in crystals lacking bound nucleotide. Among the free enzyme crystals, only one (native 2, Table 1) yielded strong electron density for this P loop. The flexibility of the Walker A loop is reminiscent of structural changes observed in other helicases upon NTP binding (7). However, the P loop of flavivirus helicase has the same conformation in the nucleotide complex and in the free enzyme.
The Walker B motif (motif II; D289EAH292) is expected to bind a divalent cation associated with nucleotide γ-phosphate. Neither the γ-phosphate nor Mg2+ was observed in the nucleotide complex of flavivirus helicase, in accord with hydrolysis of added Mg2+-ATP. However, nucleotide binding in the crystals is apparently triphosphate dependent. We were able to produce the nucleotide complex only with Mg2+-ATP, which was hydrolyzed, but not with nonhydrolyzable analogs or with ADP or ATP in the absence of a divalent cation.
Motif VI (S457AAQRRGRIGR467) in domain 2 is directly across the interdomain cleft from the Walker B motif (Fig. 2A). An arginine finger in motif VI of other helicases is thought to detect the γ-phosphate and to be critical to conformational switching upon NTP hydrolysis (1, 11, 35). In the flavivirus helicase, Arg 464 and Arg 467 are positioned to be the arginine finger. Two other arginine side chains in motif VI, Arg 461 and Arg 462, point away from the NTP site, traverse the hydrophobic core, and form hydrogen bonds on remote surfaces of domain 2.
Some of the helicase motifs are positioned to make specific contacts between domains 1 and 2. These contacts are expected to change during the NTPase catalytic cycle as the domains hinge open and closed relative to one another. Motifs Ia (L226APTRVVLSEM236) and V (T413DIAEMGAN421) face one another across the domain boundary, and would be in direct contact were the domains in a more closed configuration. Motif III (T320ATPPG325) is the linker between domains 1 and 2. Motif VI makes direct contact with motifs I, II, and III, in addition to providing the arginine finger.
Helicase motifs also occur at the interface of domain 2 and domain 3. Motif IV (F366LPSIR371) and part of motif V (T413DIAE417) face domain 3 and the presumed RNA unwinding site. These motifs may help translocate RNA through the helicase in response to NTP hydrolysis.
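The placement of the seven motifs can be tabulated from the residue ranges given in the text. The sketch below maps each motif onto the domain boundaries stated for the YFV helicase (domain 1, residues 187 to 327; domain 2, 328 to 488; domain 3, 489 to 623). The assignment is by sequence position only and says nothing about which surfaces a motif contacts in three dimensions. The names `DOMAINS`, `MOTIFS`, and `domain_of` are hypothetical.

```python
# Domain boundaries and motif ranges as given in the text (YFV NS3 numbering).
DOMAINS = {
    "domain 1": (187, 327),
    "domain 2": (328, 488),
    "domain 3": (489, 623),
}

MOTIFS = {
    "I (Walker A)": (201, 205),
    "Ia": (226, 236),
    "II (Walker B)": (289, 292),
    "III": (320, 325),
    "IV": (366, 371),
    "V": (413, 421),
    "VI": (457, 467),
}


def domain_of(first: int, last: int) -> str:
    """Return the domain that fully contains the inclusive residue range."""
    for name, (start, end) in DOMAINS.items():
        if start <= first and last <= end:
            return name
    return "spans a domain boundary or lies outside the construct"


assignment = {motif: domain_of(first, last) for motif, (first, last) in MOTIFS.items()}
domain_sizes = {name: end - start + 1 for name, (start, end) in DOMAINS.items()}

for motif, domain in assignment.items():
    print(f"motif {motif}: {domain}")
print(domain_sizes)
```

By this count, motifs I, Ia, II, and III fall in domain 1 and motifs IV, V, and VI fall in domain 2, with motif III ending two residues before the domain 1 boundary, consistent with its role as the interdomain linker.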
Domains 1 and 2 of the flavivirus helicase are similar to the corresponding domains of other superfamily 2 helicases (7, 10, 39, 47, 55). They are most similar to the HCV helicase domains 1 and 2, with which they have 21% sequence identity (Fig. 3A). The hinge (Fig. 2A) between domains 1 and 2, formed by motif III, is about 15° more closed in the YFV helicase than in the structures of the HCV helicase (Fig. 3B). The structural overlap of the YFV and HCV helicases is 1.8 Å root mean square distance for 97 of 145 Cα atoms in domain 1, and 1.7 Å root mean square distance for 108 of 158 Cα atoms in domain 2.
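The root mean square distance quoted for matched Cα atoms is computed after the two structures have been superimposed. As a minimal sketch, assuming the superposition has already been done (here it was done with O, and the sketch does not reproduce that step), the distance over N matched atom pairs is the square root of the mean squared interatomic distance. The function name `rms_distance` and the toy coordinates are hypothetical.

```python
import math


def rms_distance(coords_a, coords_b) -> float:
    """Root mean square distance between matched, already superimposed atoms.

    Each argument is a sequence of (x, y, z) tuples in angstroms, with the
    i-th atom of coords_a paired with the i-th atom of coords_b.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates (hypothetical): every atom pair is exactly 2 angstroms apart.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
toy_b = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (0.0, 1.0, 2.0)]
toy_rmsd = rms_distance(toy_a, toy_b)
print(f"{toy_rmsd:.2f} A over {len(toy_a)} matched atoms")
```

Because only a subset of Cα atoms is matched (for example, 97 of 145 in domain 1), the value describes the conserved core and is not directly comparable to a distance computed over all residues.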
A surprising difference between YFV and HCV helicases occurs in the Walker A motifs, which were expected to be highly similar. The structural difference is due to nonequivalent proline residues in helix α1, which follow the Walker A motif by five residues in YFV helicase (Pro210) and by four residues in HCV helicase (Pro215). The proline residues disrupt helical hydrogen bonding differently in the two proteins. The important result is that the Walker A motif and helix α1 have somewhat different positions in YFV and HCV helicases relative to the other helicase motifs and secondary structures of domain 1. In fact, the helicases of superfamily 2 have remarkably different Walker A motifs, both in their internal structures and in their positions relative to the Walker B motifs. Given that they catalyze identical phosphotransfer reactions using virtually identical amino acid residues, the two halves of the active site are expected to have identical structures in all superfamily 2 helicases at the moment of catalysis. The fact that different structures are observed suggests that motion of the Walker A motif, internally and in relation to Walker B, may be part of the catalytic cycle of superfamily 2 helicases.
A helicase mechanism was proposed for the HCV NS3 protein based on a dimeric form of the protein observed in one of the crystal structures (13). Other structures of HCV helicase were interpreted in terms of a functional monomer (26). YFV helicase crystallized as a monomer in a crystal form lacking dimer symmetry. Thus, the structure is not consistent with a dimer-based mechanism for helicase activity. A monomeric form of the YFV helicase in solution was also observed by dynamic light scattering (data not shown).
Domain 3 of the flavivirus helicase (residues 489 to 623) has a new fold which matched no other structure in the automated structure search. The structure is dominated by four α-helices (α7 to α10) and two long loops, which comprise 28 residues (503 to 530) between α7 and α8 and 49 residues (559 to 607) between α9 and α10. Domain 3 does not appear to be independently stable, due to its low content of secondary structure (<40%). The first long loop is stabilized by extensive contacts with domain 1, notably a buried charge bridge between the invariant side chains of Arg302 and Glu516 and backbone hydrogen bonds between residues 274 and 503. Domain 3 is effectively anchored to domain 1 by these specific hydrogen bonds and by extensive hydrophobic interactions between the first long loop and helix α7 of domain 3 and helix α4 of domain 1. In contrast, domain 3 contacts domain 2 only through a β-hairpin (β12-β13, residues 432 to 452) that extends away from the core of domain 2. Residues 601 and 603 in the second long loop form β-type hydrogen bonds with residue 445 in the β-hairpin. Strong conservation of the domain 3 structure is inferred from the 23 nearly invariant positions in the 135-residue domain, the 40 to 90% pairwise identity among flavivirus sequences, and the specific contacts of several conserved charged and polar side chains. Notable among these is a charge bridge, the RED bridge, between the invariant side chains of Arg604 (R), Glu494 (E), and Asp602 (D).
In contrast to the conservation of domain 3 among flavivirus sequences, the structures of domain 3 of the flavivirus and HCV proteins are sufficiently different to render structural alignment nearly impossible and sequence alignment meaningless (Fig. 3A). Nevertheless, and despite the failure of an automated fold search, we find vestiges of a common ancestor for domain 3 of the YFV and HCV helicases. Three helices in domain 3 are similarly oriented (α7, α8, and α9 of YFV helicase), and 31 pairs of Cα atoms can be superimposed with a root mean square distance of 2.1 Å (Fig. 3B).
There is low sequence similarity among the 31 matched residues (only three identities), and the group of three helices is oriented somewhat differently in the YFV and HCV helicases with respect to domains 1 and 2 (Fig. 3B). However, similar key features occur in both structures, particularly at the loop between the second (α8) and third (α9) helices and its connection to the first helix (α7). For example, a carboxyl side chain in the first helix of both proteins (invariant Glu494 in α7 of YFV helicase; Asp 496 in HCV helicase) is hydrogen bonded to the positive dipole at the N terminus of the third helix. Glu494 of YFV helicase is also part of the flavivirus-invariant RED bridge described above. Both proteins have the β-type hydrogen bonds between domain 3 and the β-hairpin extending from domain 2, as described above. Both proteins have a carboxylate side chain at the C terminus of the second helix (invariant Asp545 in α8 of YFV helicase; Glu555 in HCV helicase). These conserved interactions position invariant Asp545 so that its side chain is directed into the presumed RNA-binding cleft between domain 3 and domains 1 and 2. In the complex of HCV helicase and uridine deoxyoctanucleotide [(dU)8], this region of the structure faces the bases of the DNA oligomer (26). Thus, Asp545 is predicted to have an essential role in the helicase activity of flavivirus NS3 based on its position in the structure and its conservation.
Differences in the structures of domain 3 in the YFV and HCV helicases have implications for polyprotein processing at the NS3-NS4A junction by the viral protease. The N and C termini of the YFV helicase are on opposite sides of the protein (Fig. 3C). In contrast, the N and C termini of the HCV helicase are on the same side of the protein. Same-side localization of the HCV helicase termini allows the NS3-NS4A junction to bind in the protease active site in cis. This is demonstrated by the structure of intact HCV NS3 (56), in which the C terminus of NS3 is in the protease active site, representing the product complex of cis cleavage of the NS3-NS4A junction (Fig. 3C). In contrast, the natural C terminus of YFV helicase cannot reach the protease active site, even if the 12-residue linker between protease and helicase domains is fully extended, without substantial rearrangement of either the protease or helicase structure. Thus, we predict that the flaviviral protease does not cleave the NS3-NS4A junction in cis. The HCV protease uses a cofactor from NS4A, which may bind and activate the protease concomitant with cis cleavage of the NS3-NS4A junction. Opposite-side localization of helicase termini precludes such a scheme for flavivirus NS3 and may explain the use of NS2B as a flavivirus protease cofactor.
YFV helicase domain 3 contacts domains 1 and 2 to form a groove at the domain boundary. Only two structures have been reported of helicase-nucleic acid complexes. In both structures, a DNA single strand is bound to the cleft between domain 3 and domains 1 and 2, such that the 3′ end is beneath domain 1 and the 5′ end is beneath domain 2. One structure is a complex of HCV helicase with a DNA oligomer, (dU)8 (26). The second is a complex of superfamily 1 helicase PcrA with a DNA duplex having a 3′ overhanging single strand, in which the duplex is bound along the side of the protein shared by domain 2 and the analog of domain 3 (49). Based on the common binding orientation in these complexes, we assume that an RNA single strand proceeds through the major interdomain cleft of the flavivirus helicase from the domain 2 side towards the domain 1 side of the protein (Fig. 3D).
A conserved tryptophan (Trp501) in domain 3 of HCV helicase stacks with one of the bases of the nucleic acid and is required for its RNA helicase activity (26, 27). However, the flavivirus helicase has no conserved aromatic residue at either end of the presumed single-strand RNA binding cleft.
Several regions of the flavivirus helicase have been associated with specific functions of the protein. Mutations in a cluster of positively charged residues preceding the first β-strand (YFV Lys 189 and Lys 190) eliminated RNA stimulation of NTPase activity in the dengue virus helicase (R. Padmanabhan, personal communication). Amino acid substitutions at Arg 513 of the dengue virus helicase (YFV helicase Gly 517) eliminated RNA triphosphatase activity (6). Residues 189, 190, and 517 all map to the outer surface of domain 1 (Fig. 4), between the entrance to the NTPase site and the presumed exit of single-stranded RNA from the interdomain groove (Fig. 3D). Single-stranded RNA may interact with this positively charged surface of domain 1 and thereby influence events in the NTPase site.
A distant cluster of positively charged residues was implicated in RNA binding in the dengue virus helicase (32). The corresponding residues (Arg 381, Lys 382, and Lys 385) in the YFV helicase map to the end of helix α5 on the outer surface of domain 2 (Fig. 4). This may be a binding site for duplex RNA prior to strand separation, because it is on the same side of the helicase as the duplex binding region of PcrA, a superfamily 1 helicase (49).
A specific internal cleavage of NS3 by the viral protease has been observed for some flaviviruses (3, 37, 44) at a site corresponding to Arg462 in helicase motif VI. Arg462 is buried inside the protein in helix α6 where it should be inaccessible to proteases (Fig. 4). Thus, the substrate for internal cleavage at Arg462 should be less folded than the native protein, and cleaved NS3 should be unable to refold to a fully native form. If internal cleavage of NS3 has a function in the flavivirus life cycle, it may be in NS3 turnover.
Finally, mutation of YFV helicase Asp 343 to Val, Ala, or Gly suppressed the effects of an NS2A mutation that prevented cleavage of NS2A at a secondary site and formation of infectious virus (29). Asp 343 is on a surface loop in domain 2, far from both domain 3 and the NTPase active site (Fig. 4). Asp 343 and the linker to the protease domain are on the same surface of NS3. This surface may influence protease activity by interacting with protease substrates, such as NS2A.
In a flavivirus-infected cell, NS3 is part of a viral protein complex that replicates the genome, processes polyprotein, and initiates genome packaging. The complex includes viral protease/helicase NS3, protease cofactor NS2B, RNA-dependent RNA polymerase NS5, and perhaps other viral nonstructural proteins. The highly conserved 3′ untranslated region of the viral genome is also involved in replicase complex formation (12), which may be regulated by phosphorylation of NS5 on multiple serine residues (24). The replication complex is poorly characterized, but NS3 and NS5 are thought to make key protein contacts.
A physical complex of NS3 and NS5 was detected by immunoprecipitation of proteins from both dengue virus type 2 and Japanese encephalitis virus (12, 24). The NS3-NS5 interaction region was localized to 300 residues at the C terminus of NS3, corresponding to domains 2 and 3 of the helicase (22), and to 40 residues within NS5, between the N-terminal methyltransferase and C-terminal RNA-dependent RNA polymerase domains (22). The interdomain region of NS5 also contains two nuclear localization signals (22), consistent with observations of NS5 alone in nuclei of infected cells (17). The 40-residue NS3 binding site overlaps the nuclear localization signal region (22), leading to the possibility that NS3, which is not found in host cell nuclei, may affect the nuclear localization of NS5.
The analogous proteins from HCV interact differently. The HCV RNA-dependent RNA polymerase (NS5B) lacks a nuclear localization signal, is not found in host cell nuclei, and interacts not with the helicase region but with the N-terminal protease region of NS3 (21). The surprisingly different lifestyles of the flavivirus and HCV proteins parallel the strikingly different structures for domain 3 of their NS3 helicases. In particular, the C-terminal region of the flavivirus domain 3 has a large protein surface that is remote from sites associated with the primary catalytic functions of NS3 and that we predict is an interaction site for NS5 (Fig. 3C). Understanding the molecular determinants of viral RNA replication will require a description of unique protein-protein interactions such as those that are suggested here.
The yellow fever virus helicase structure and our main conclusions about flavivirus are consistent with the structure of the dengue virus helicase presented in the accompanying paper (T. Xu, A. Sampath, A. Chao, D. Wen, M. Nanao, P. Chene, S. G. Vasudevan, and J. Lescar, J. Virol. 79:10278-10288, 2005).
We thank Richard Kinney for kindly providing pWN-CG, encoding the West Nile virus helicase, Charles Rice for pACNR/FLYF, encoding the YFV helicase, R. Pad Padmanabhan for sharing unpublished results on the dengue virus helicase, and staff members of the Advanced Photon Source beamlines used in this work, SBC CAT and GM/CA CAT.
This work was supported by NIH grant P01 AI-055672 to R.J.K. and J.L.S.