The RNA polymerase (gene product NS5B) from the hepatitis C virus is responsible for replication of the viral genome and is a validated drug target for new therapeutic agents. NS5B has a structure resembling an open right hand (containing the fingers, palm, and thumb subdomains), a hydrophobic C-terminal region, and two magnesium ions coordinated in the palm domain. Biochemical data suggest that the magnesium ions provide structural stability and are directly involved in catalysis, while the C-terminus plays a regulatory role in NS5B function. Nevertheless, the molecular mechanisms by which these two features regulate polymerase activity remain unclear. To answer this question, we performed molecular dynamics simulations of NS5B variants with different C-terminal lengths in the presence or absence of magnesium ions to determine the impact on enzyme properties. We observed that metal binding increases both the magnitude and the degree of correlated enzyme motions. In contrast, we observed that the C-terminus restricts enzyme dynamics. Under certain conditions, our simulations revealed a fully closed conformation of NS5B that may facilitate de novo initiation of RNA replication. This knowledge is important because it fosters the development of a comprehensive description of RNA replication by NS5B and is relevant to understanding the functional properties of a broad class of related RNA polymerases such as 3D-pol from poliovirus. Ultimately, this information may also be pertinent to designing novel NS5B therapeutics.
The hepatitis C virus (HCV) has infected 170 million people worldwide, and approximately 4 million people within the United States.1,2 There is no cure for HCV infection, and 25% of infected individuals contract chronic liver ailments such as cirrhosis or liver cancer.1,3 The current standard of care, ribavirin and pegylated interferon, is not effective across the six genotypes of HCV and can have severe side effects.4 Thus, better treatments are sorely needed. The RNA polymerase from HCV (gene product NS5B) is a validated drug target because it is crucial for replication of the viral genome.1,5–11 Consequently, it is important to understand the mechanisms by which the activity of this enzyme is modulated.
Among the components of NS5B associated with changes in activity are the C-terminal residues and the presence of magnesium ions. NS5B has the right-handed structural organization into fingers, palm, and thumb domains that is typical of viral RNA polymerases.8,12 NS5B contains 591 residues; the last 60 residues (532–591), which form the hydrophobic C-terminus, are thought to be associated with the membrane of the endoplasmic reticulum (ER) in vivo and cause the enzyme to be insoluble in vitro.13,14 Consequently, biochemical studies often make use of NS5B variants in which the C-terminus is truncated.14,15 This issue also impacts structural studies, with most of the available crystal structures generated for truncated versions of the enzyme. However, NS5B with C-terminal truncations is replication competent and exhibits RNA polymerase activity.13,14 In vitro studies indicate that the presence of the C-terminus decreases enzyme activity.14 In the absence of membrane association, residues 532–544 wrap around the thumb domain while residues 545–562 occupy the interface between all three domains, interacting closely with several residues from the fingers and thumb (Figure 1).14,15
We note that this conformation of the C-terminus of the enzyme occurs in the crystal structures used to initiate our present simulation studies (see Materials and Methods) and may also occur in other in vitro situations. Many structural studies and activity assays of NS5B are conducted under conditions where this conformation of the C-terminus is possible. In addition, much of what we currently know about the enzyme’s structure and function is derived from such studies. Thus, it is important to understand how this conformation of the C-terminus modulates the structure and function of the enzyme.
Apart from C-terminal residues, NS5B activity is also modulated by the presence of magnesium ions, which are required for RNA replication in vitro.16–18 Structural studies indicate two magnesium ions can be coordinated in the palm domain. The metal ions stabilize the enzyme structure and create a favorable geometry of the active site for the initiation of replication.16,17 The magnesium ions are coordinated by three aspartic acid residues (220, 318, and 319) and are believed to be directly involved in catalysis.16,17
NS5B is thought to replicate RNA in two stages: initiation and elongation.19 These stages are associated with the closed and open conformations of the enzyme, respectively.20,21 However, while crystal structures of both conformations have been seen in HCV subtype 2a [Protein Data Bank (PDB) entries 1YUY and 1YV2], only the closed conformation has been seen for subtype 1b.22 In the closed conformation, the fingers and thumb domains are rotated toward each other, positioning residues in the template channel for initiation. During elongation, the thumb domain rotates away from the fingers domain in the open conformation, opening the exit channel for nascent RNA at the front of the enzyme.23 Transitions between the open and closed conformation are thought to be regulated by the Δ1 loop (residues 20–35).23 Chinnaswamy and associates deleted the tip of the Δ1 loop (residues 26–30) and used electron microscopy to confirm that NS5B remained in the open conformation and did not sample the closed conformation.23
To date, crystallographic and biochemical experiments have provided a wealth of information about the structure and activity of NS5B. However, these studies are unable to provide a molecular level understanding of the roles the different components of the enzyme play in modulating activity. The structural intermediates involved in RNA replication have not been fully determined, and much about the process of replication remains unclear, such as the role that the C-terminus or metal ions play in altering the structure and dynamics of the enzyme. Furthermore, the conformational changes necessary for RNA replication suggest that specific motions associated with conformational transitions are required for NS5B to function.
In this article, we employ molecular dynamics (MD) simulations to understand how the C-terminus of the enzyme and the presence of magnesium ions influence the structural and dynamic properties of NS5B. We performed MD simulations of NS5B with different C-terminal lengths (531, 562, and 563 residues), both with and without magnesium ions bound (see Table 1 for a description of each system). We observed that the absence of C-terminal residues in conjunction with the presence of magnesium ions induced the enzyme to exhibit a new conformation that is more closed than that observed in the crystallographic coordinates or in our previous simulation studies.24 This observation suggests that the crystal structures may not represent the most closed conformation of the enzyme, as originally thought. In our previous studies, we described a “hyper-closed” conformation observed for a structure of NS5B containing an allosteric inhibitor bound to the thumb domain. However, the closed conformation observed here is different from that observed in our previous work.24 We do not believe that the hyper-closed conformation described for the ligand-bound enzyme allows the enzyme to be active.24 In contrast, we provide evidence that the fully closed conformation observed in this study is a functionally relevant conformational state. This enzyme conformation may represent the initial state to which substrates bind to begin one round of replication of the linear HCV genome. Our studies suggest that the C-terminal residues prevent the achievement of a completely closed conformation and thus likely inhibit the initiation of replication. The results of this study help us to understand how the activity of NS5B is regulated, information that is relevant to illuminating the function of related viral polymerases such as the 3D-pol from poliovirus and to identifying new ways to inhibit NS5B.
Crystal structures of genotype 1b HCV NS5B from the PDB with the following PDB entries were used as the starting coordinates: 2WHO, 2HAI, 3CO9, and 3HHK. 2WHO lacks the C-terminus and contains 531 amino acids. 2HAI and 3CO9 both have 562 residues, and 3HHK has 563. 2HAI also has three point mutations: L47Q, F101Y, and K114R. Each structure contains an allosteric inhibitor bound to a different location of the enzyme. Both 2WHO and 2HAI contain a non-nucleoside inhibitor in the NNI2 thumb binding site at the base of the thumb, ~30 Å from the active site. 3CO9 and 3HHK contain inhibitors in the NNI3 and NNI4 binding sites, respectively, located in the palm domain adjacent to the active site. Inhibitors were removed from each structure before simulations were begun.
These structures were chosen to represent diverse NS5B constructs and initial coordinates, and we are in the process of simulating these constructs with their inhibitors present to determine the effect of allosteric inhibitors in the various binding sites. In addition, the 2WHO structure contains two manganese ions bound to the divalent ion site and was used for placement of the magnesium ions in each structure. The manganese ions were replaced with magnesium and the metal ions placed in the other constructs via alignment with 2WHO. Simulations were conducted with enzymes in both apo and metal-bound forms.
All structures were prepared for simulation by removing nonprotein atoms from the coordinates and adding hydrogen atoms using GROMACS version 126.96.36.199 In cases where the crystal structure contained more than one chain in the unit cell, chain A was selected for simulation. Each structure was placed in a truncated octahedral unit cell that was larger than the protein by 12 Å in each dimension, and explicit SPC water26 molecules were added. To neutralize the system, 15 chloride ions were added to the apoenzyme systems and 19 chloride ions to the enzymes containing magnesium ions. The total number of solvent and solute atoms present in each system is provided in Table 1. The OPLS/all-atom force field27 was employed to describe inter- and intramolecular interactions.
Minimization of the solvated protein coordinates was conducted for 50000 steps using the steepest descent algorithm and applying periodic boundary conditions. All covalent bonds to hydrogen atoms were constrained using the SHAKE algorithm,28 and electrostatic interactions were calculated by the particle mesh Ewald method (PME).29 A cutoff of 10 Å was used for both the Coulomb interactions and the nonbonded pair list.
This was followed by equilibration in the NPT ensemble for 3–5 ns using a 2 fs time step, during which Parrinello–Rahman coupling was applied to maintain the pressure at 1.01 bar. Once the volume of the unit cell had stabilized, a snapshot of the NPT equilibration in which the pressure was closest to 1.01 bar was written out. Further MD simulations were performed using the NVT ensemble for 400–700 ns. The v-rescale thermostat was used to maintain a temperature of 300 K. The last 200 ns of equilibrated data was used for data analysis. Snapshots of the MD simulations were collected every 10 ps for data analysis, and VMD was used to view the resulting trajectories.
During the NVT simulations of 3CO9, the pressure rose above 1.01 bar. The secondary and tertiary structures of the enzyme were maintained upon visual inspection. Thus, another NPT simulation was conducted to re-equilibrate the pressure. NVT simulations were then conducted as described above.
Root-mean-square fluctuations (rmsfs) were calculated for Cα atoms to determine the flexibility of each residue during the equilibrated simulations (eq 1). After each snapshot had been superimposed onto the initial minimized structure, the reference position of each atom (x̃i) was subtracted from the instantaneous position of the atom in each snapshot of the trajectory [xi(tj)]
\[ \mathrm{rmsf}_i = \sqrt{\frac{dt}{T} \sum_{t_j} \left[ x_i(t_j) - \tilde{x}_i \right]^2} \qquad (1) \]
where tj is the time of the jth snapshot, dt is the interval at which coordinates were written, and T is the total simulation time.
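The rmsf calculation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the GROMACS implementation used in this work; the function name `rmsf` and the array layout are assumptions, and the snapshots are assumed to have been superimposed onto the reference structure already. Averaging over snapshots is equivalent to the dt/T prefactor because the number of snapshots equals T/dt.

```python
import numpy as np

def rmsf(traj, ref):
    """Per-atom root-mean-square fluctuation.

    traj: array (n_frames, n_atoms, 3), already superimposed on `ref`
    ref:  array (n_atoms, 3), reference position of each atom
    """
    traj = np.asarray(traj, dtype=float)
    ref = np.asarray(ref, dtype=float)
    disp = traj - ref[None, :, :]          # x_i(t_j) - reference position
    sq = np.sum(disp ** 2, axis=2)         # squared displacement per frame and atom
    return np.sqrt(sq.mean(axis=0))        # mean over frames equals (dt/T) * sum
```

Passing the Cα coordinates of the last 200 ns of a trajectory would give one value per residue.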
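Covariance matrices (eq 2) were calculated using the “covar” utility in GROMACS. The covariance matrix allows the viewing of correlated motions sampled by atoms i and j during the MD trajectory. The normalized covariance between two atoms (Cij) is determined by taking the product of the difference between instantaneous positional coordinate ri or rj and average position ⟨ri⟩ or ⟨rj⟩, where r includes the x, y, and z directions:
\[ C_{ij} = \frac{\langle (r_i - \langle r_i \rangle) \cdot (r_j - \langle r_j \rangle) \rangle}{\sqrt{\langle (r_i - \langle r_i \rangle)^2 \rangle \, \langle (r_j - \langle r_j \rangle)^2 \rangle}} \qquad (2) \]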
The covariance matrix is normalized as shown in eq 2 so that atoms with completely positively correlated motion display Cij values of 1, while atoms with completely negatively correlated motions display Cij values of −1. If two atoms are not correlated or move in orthogonal directions, their Cij values will be zero. Cα atoms were selected for this analysis to represent the overall motion of each residue. The covariance matrix was viewed using the “colormap” utility in MATLAB.
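As an illustration of the normalization just described, the following sketch computes a per-atom normalized covariance matrix with NumPy. It is not the GROMACS “covar” utility (which works on the full 3N × 3N matrix); the function name and array layout are assumptions, and every atom is assumed to have nonzero positional variance.

```python
import numpy as np

def normalized_covariance(traj):
    """Normalized covariance C_ij between atoms.

    traj: array (n_frames, n_atoms, 3) of superimposed Calpha coordinates
    returns: array (n_atoms, n_atoms) with values in [-1, 1]
    """
    traj = np.asarray(traj, dtype=float)
    disp = traj - traj.mean(axis=0)                       # r_i(t) - <r_i>
    # <dr_i . dr_j>: dot product over x, y, z, averaged over frames
    cov = np.einsum("tix,tjx->ij", disp, disp) / traj.shape[0]
    norm = np.sqrt(np.diag(cov))                          # sqrt(<dr_i^2>)
    return cov / np.outer(norm, norm)
```

Two atoms that move in lockstep give a value of 1, and two atoms that move in exactly opposite directions give −1, matching the limits stated in the text.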
PCA diagonalizes the covariance matrix (Cij) to generate eigenvalues (λ), using the eigenvector matrix (V) (eq 3).
\[ V^{T} C_{ij} V = \lambda \qquad (3) \]
Together, the eigenvalues and eigenvectors comprise the principal components (PCs) of the protein motion. The eigenvectors describe the direction of the motions sampled during the trajectories, while the eigenvalues describe the magnitude of these motions. The PCs are analogous to vibrational modes from a normal mode analysis.30 Typically, the first PC (λ1) accounts for most of the motion, while the second (λ2) accounts for a smaller amount, etc.31 These modes can be used to identify the inherent fluctuations present in the trajectory.
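The diagonalization step can be sketched as follows. This is a minimal NumPy illustration under stated assumptions (no mass weighting, snapshots already superimposed, hypothetical function name `pca`), not the GROMACS procedure used in this work. For a 531-residue Cα selection the Cartesian covariance matrix has 3 × 531 = 1593 modes.

```python
import numpy as np

def pca(traj):
    """Principal component analysis of a superimposed trajectory.

    traj: array (n_frames, n_atoms, 3)
    returns: (eigenvalues, eigenvectors, mean), sorted so that the first
             eigenvalue and first eigenvector column describe the
             largest-amplitude motion
    """
    traj = np.asarray(traj, dtype=float)
    n_frames = traj.shape[0]
    x = traj.reshape(n_frames, -1)            # flatten to (n_frames, 3N)
    mean = x.mean(axis=0)
    dx = x - mean
    cov = dx.T @ dx / n_frames                # 3N x 3N covariance matrix
    evals, evecs = np.linalg.eigh(cov)        # symmetric solver, ascending order
    order = np.argsort(evals)[::-1]           # reorder to descending
    return evals[order], evecs[:, order], mean
```

The sum of the returned eigenvalues equals the trace of the covariance matrix, which is the quantity a summed-eigenvalue comparison of overall flexibility relies on.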
Projections of the trajectories onto the PCs were conducted to visualize the conformational space being sampled by the enzyme during the simulations. These were done using the “anaeig” utility in GROMACS. Protein coordinates from the trajectories, x(t), were compared to a reference structure, ⟨x⟩, and then projected onto eigenvectors V^T from PCA in eq 4.25
\[ P = V^{T} \left[ x(t) - \langle x \rangle \right] \qquad (4) \]
The value of P is zero when the coordinates in x(t) are identical to ⟨x⟩.
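A projection of this kind amounts to a single matrix product. The sketch below is illustrative only; the function name `project`, the default of two PCs, and the array layout are assumptions, and the eigenvectors are assumed to be stored as columns ordered by decreasing eigenvalue.

```python
import numpy as np

def project(traj, ref, evecs, n_pcs=2):
    """Project each snapshot onto the first `n_pcs` eigenvectors.

    traj:  array (n_frames, n_atoms, 3), superimposed on the reference
    ref:   array (n_atoms, 3), reference structure
    evecs: array (3N, n_modes), eigenvectors as columns
    returns: array (n_frames, n_pcs) of projection values
    """
    traj = np.asarray(traj, dtype=float)
    x = traj.reshape(traj.shape[0], -1)               # (n_frames, 3N)
    dx = x - np.asarray(ref, dtype=float).reshape(-1) # deviation from reference
    return dx @ np.asarray(evecs, dtype=float)[:, :n_pcs]
```

Plotting the first column against the second gives a two-dimensional map of the conformational space sampled; a snapshot identical to the reference lands at the origin.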
We chose to project our trajectories onto two vectors (PCs 1 and 2) from the simulation of 1b, because they represent the largest amplitude fluctuations observed in the trajectories. In addition, because 1b contains magnesium ions but not the C-terminus (see Discussion), we believe these motions are most likely to be functionally relevant.32,33 These PCs include 400 ns of data representing both the intermediate and fully closed conformations sampled by this enzyme. Root-mean-square inner product (rmsip) values are employed to evaluate the degree of similarity between PC vector spaces obtained from each simulation. rmsip values demonstrate that there is little similarity between the lowest-frequency eigenvectors representing the intrinsic motions from different systems. To perform both the projections and the rmsip calculations, we truncated the systems containing C-terminal residues to 531 residues to match 1 (see Table 1). For the remainder of the paper, use of a single number denotes both trajectories initiated from this structure; i.e., 1 indicates both 1a and 1b.
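For reference, the rmsip between two sets of eigenvectors is commonly defined as the square root of the summed squared inner products between the first n eigenvectors of each set, divided by n. The sketch below follows that common definition; the choice n = 10 is a conventional default and an assumption here, since the number of eigenvectors used is not stated in the text, and both sets must be expressed in the same coordinate basis (hence the truncation to 531 residues).

```python
import numpy as np

def rmsip(evecs_a, evecs_b, n=10):
    """Root-mean-square inner product of two eigenvector subspaces.

    evecs_a, evecs_b: arrays (3N, n_modes) with orthonormal eigenvectors
                      as columns, ordered by decreasing eigenvalue
    n: number of leading eigenvectors compared (assumed default of 10)
    returns: value between 0 (orthogonal subspaces) and 1 (identical)
    """
    a = np.asarray(evecs_a, dtype=float)[:, :n]
    b = np.asarray(evecs_b, dtype=float)[:, :n]
    if a.shape != b.shape:
        raise ValueError("eigenvector sets must have matching shapes")
    overlap = a.T @ b                       # inner products v_i . w_j
    return np.sqrt(np.sum(overlap ** 2) / a.shape[1])
```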
The solvent accessible surface area (SASA) was calculated for the residues lining the template channel (14, 15, 93–98, 137–141, 158–162, 168, 224, 225, 269, 282–291, 317, 318, 404, 405, and 444–451) as a measure of the accessibility of the template channel to the environment. These residues were identified using crystal structure 1NB7,34 which contains a four-nucleotide fragment of RNA template. Any amino acids that were <5 Å from any nucleotide atom in crystal structure 1NB7 were chosen for the SASA calculation during the simulation. The “sas” utility in GROMACS was used for this calculation, and a probe radius of 1.4 Å was employed.
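The residue selection step (any amino acid within 5 Å of a nucleotide atom) can be sketched as a simple distance filter. The function name and inputs are hypothetical; coordinates are assumed to be in Å, and parsing of the 1NB7 coordinates is left out. The SASA calculation itself is not reproduced here.

```python
import numpy as np

def residues_near_ligand(prot_xyz, prot_resid, lig_xyz, cutoff=5.0):
    """Residue numbers with at least one atom closer than `cutoff` to the ligand.

    prot_xyz:   array (n_prot_atoms, 3), protein atom coordinates in angstroms
    prot_resid: array (n_prot_atoms,), residue number of each protein atom
    lig_xyz:    array (n_lig_atoms, 3), nucleotide atom coordinates in angstroms
    """
    prot_xyz = np.asarray(prot_xyz, dtype=float)
    lig_xyz = np.asarray(lig_xyz, dtype=float)
    prot_resid = np.asarray(prot_resid)
    # all protein-ligand atom pair distances, shape (n_prot_atoms, n_lig_atoms)
    d = np.linalg.norm(prot_xyz[:, None, :] - lig_xyz[None, :, :], axis=2)
    close = (d < cutoff).any(axis=1)        # atom is near any ligand atom
    return sorted(set(prot_resid[close].tolist()))
```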
The angle between the domains was determined by calculating the angle between the two vectors connecting the centers of masses of (i) the palm and fingers domains and (ii) the palm and thumb domains. The residues in each domain were selected according to the definition of Lesburg et al. as follows: fingers, 1–188 and 227–287; palm, 189–226 and 288–370; thumb, 371–529.35
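The interdomain angle defined above can be illustrated with the following sketch, which uses the residue ranges quoted from Lesburg et al. The function names, the `DOMAINS` dictionary, and the input layout are assumptions made for illustration, and each domain is assumed to contain at least one selected atom.

```python
import numpy as np

# Domain definitions as given in the text (inclusive residue ranges)
DOMAINS = {
    "fingers": [(1, 188), (227, 287)],
    "palm": [(189, 226), (288, 370)],
    "thumb": [(371, 529)],
}

def domain_com(xyz, masses, resid, ranges):
    """Center of mass of the atoms whose residue number falls in `ranges`."""
    mask = np.zeros(len(resid), dtype=bool)
    for lo, hi in ranges:
        mask |= (resid >= lo) & (resid <= hi)
    m = masses[mask]
    return (m[:, None] * xyz[mask]).sum(axis=0) / m.sum()

def domain_angle(xyz, masses, resid):
    """Angle (degrees) between the palm->fingers and palm->thumb vectors."""
    xyz = np.asarray(xyz, dtype=float)
    masses = np.asarray(masses, dtype=float)
    resid = np.asarray(resid)
    com = {name: domain_com(xyz, masses, resid, r) for name, r in DOMAINS.items()}
    v1 = com["fingers"] - com["palm"]
    v2 = com["thumb"] - com["palm"]
    cos_angle = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    # clip guards against rounding slightly outside [-1, 1]
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
```

Evaluating this function on every snapshot yields a time series of the kind used to follow opening and closing of the enzyme.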
Table 1 describes the systems that were simulated, while Table 2 describes other relevant enzyme structures that will be discussed throughout this work. The last 200 ns of equilibrated data was used for data analysis of each system. The structures containing C-terminal residues equilibrated around 3 Å root-mean-square deviation (rmsd) from the initial minimized structure, while 1 (which lacks C-terminal residues) equilibrated around 4 Å (see Figure 7). The rmsd plots for systems containing C-terminal residues are provided in SI-1 of the Supporting Information. Data analysis suggests that the presence of the C-terminus suppresses the impact of magnesium ions on the enzyme’s properties when both were present. Therefore, we will focus on data for 1, the structure lacking the C-terminus, to demonstrate the full effects of the magnesium ions, as the effect of metal binding was most pronounced when these systems were compared. In addition, system 1b revealed a new conformation, which we will term “fully closed”.
The C-terminus lodges between the fingers and thumb domains, interfering with communication between the two domains (Figure 1). We observed interactions in our simulations consistent with contacts observed by Adachi et al.14 However, because of the extreme flexibility of the C-terminus, we also observed additional interactions not noted previously. In particular, for 4, we observed an additional hydrogen bond between the hydroxyl H of Y176 and the hydroxyl O of S563 (3.3 ± 1.57 Å). Overall, the interactions between the C-terminus and the fingers and thumb domains restrict the motion of the entire enzyme.
A comparison of the summed eigenvalues for our simulated structures is given in Table 3 and shows that the presence of the C-terminus decreases the overall enzyme flexibility. The structure of 1b is also subdivided into separate intermediate and fully closed sections. If one combines these two sections into one unit, 1b displays the largest amount of conformational sampling of all the simulated systems. While 2 and 3 have the same number of residues, the C-terminus has a different effect on enzyme flexibility in the two cases. 2 is less flexible than 3, which may result from the mutations in 2 (see Table 1). These mutations exchange residues with smaller side chains for larger side chain residues; this may cause steric hindrance and contribute to the enzyme’s decreased flexibility. Structure 4 samples the least conformational space, which is likely due to the hydrogen bond between Y176 and S563 that is noted above imparting additional structural rigidity. Note that 3 has the same sequence as 4 but contains only 562 residues. It is the missing C-terminal serine (S563) that participates in the indicated hydrogen bond in 4. Thus, 3 serves as a control for the effect of this hydrogen bond in 4. By inspecting Table 3, one can see that 3 does in fact display greater flexibility than 4.
Projecting the protein coordinates on the two lowest-frequency PCs from 1b shows that enzymes with C-terminal residues (2–4) explore less of the conformational space represented in these PCs than either 1a or 1b (Figure 2). This is consistent with the decreased enzyme flexibility observed for these simulations (Table 3). This observation demonstrates the ability of the C-terminus to inhibit conformational sampling. The plots also demonstrate that enzymes containing the C-terminus are in conformations very similar to the initial crystal structures and unlike those sampled by 1b. A table of root-mean-square inner products (rmsips) and the percentage of cosine content for the 1% of the lowest-frequency modes (15/1593) are provided as SI-2 and SI-3 of the Supporting Information, respectively. The rmsip values indicate that the low-frequency modes sampled in each of the trajectories differ substantially, while the low cosine content observed indicates these modes in general do not simply represent random diffusive motions.
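Cosine content is commonly computed from the time series of a projection onto a single PC, by comparing it with a cosine of i half-periods over the trajectory length. The sketch below follows that widely used definition (assumed here; the text does not state the formula) in a discrete form with evenly spaced snapshots. A value near 1 suggests random diffusion, while a value near 0 suggests the mode is not diffusive.

```python
import numpy as np

def cosine_content(p, i=1):
    """Cosine content of a projection time series.

    p: array (n_frames,), projection onto one principal component,
       sampled at evenly spaced times (must not be all zeros)
    i: index of the principal component (1 for the first PC)
    returns: value between 0 and 1
    """
    p = np.asarray(p, dtype=float)
    n = len(p)
    t = (np.arange(n) + 0.5) / n            # midpoint times as a fraction of T
    c = np.cos(i * np.pi * t)
    # discrete form of (2/T) * (integral of c*p)^2 / (integral of p^2)
    return 2.0 * np.dot(c, p) ** 2 / (n * np.dot(p, p))
```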
Covariance analysis was used to identify correlated motions occurring in the equilibrated enzyme simulations (see Figure 3). We will focus on the correlations observed for structures 1b, 2b, 3b, and 4b, as we expect the metal-bound enzymes to be the most functionally relevant. The disorder and overall flexibility of the C-terminus cause it to associate with different parts of the enzyme in each system. This is demonstrated by the varying patterns of correlation experienced by the C-terminus in each system.
The most abundant motions in simulation 1b are anticorrelated motions between the fingers and thumb domains. These motions make up the bulk of the first PC of the trajectory. Visualization of the first PC shows that the anticorrelated motion between the fingers and thumb domains involves the top of these domains moving toward and away from each other over the course of the simulation. A correlation map of the motions observed during PC1 is shown in SI-4 of the Supporting Information to demonstrate that this PC is largely responsible for this pattern of correlations. Over time, the enzyme structure becomes more closed, as indicated in Figure 2. The closed enzyme conformation likely favors initiation of RNA replication.23
Structures 2b and 3b display the most extensive pattern of correlated motions overall. In particular, they both exhibit more anticorrelated motion between the fingers and thumb domains than does 1b. While the two domains still move toward or away from each other, the anticorrelated motions are between different residues from the fingers and thumb domains compared to the motions observed for 1b. For 2b, thumb residues 440–455 (also known as the β-loop) strongly interact with the C-terminus, as do residues from the fingers domain. Thus, the β-loop is forced to move in concert with the fingers domain. This loop has been associated with the correct positioning of the RNA template.36 If the β-loop is otherwise engaged by interacting with the C-terminus, it may not be available to coordinate the incoming template RNA strand. The first mode for 2b shows motions throughout the entire enzyme, with most of the motion coming from the early part of the C-terminus (residues 532–544). There is also a shift in the enzyme’s conformation, as the bottom halves of the fingers and thumb domains move toward each other.
3b experiences slightly different anticorrelated motions between the fingers domain and the palm and thumb domains compared to those of 2b, as seen in Figure 3. The lowest-frequency PC for 3b indicates that the C-terminus is extremely flexible and moves in concert with the fingers domain. The fingers and thumb domains are tightly associated with the latter part of the C-terminus. As the trajectory progresses, the domains become wedged apart even more by a cranking motion of the early C-terminal residues. Overall, this causes the enzyme conformation to become more open.
Finally, 4b does not display widespread correlated motions and is also the least flexible. This is due to the hydrogen bond between the additional C-terminal residue S563 and residue Y176 noted previously. This additional interaction increases the rigidity of the enzyme, decreasing the magnitude of fluctuations that occur as indicated by the summed eigenvalues (Table 3). Most of the motion represented in the lowest-frequency PC of 4b occurs in the rear of the fingers domain in residues 140–160 (Δ2 loop). The C-terminus along with the rest of the enzyme is seen to be extremely rigid.
Overall, the motions observed for structures containing C-terminal residues differ greatly from those of 1b. Although 2b and 3b demonstrate anticorrelated motions between the thumb and fingers domains, these motions appear to be governed by the fluctuations of the C-terminus, which interacts with both domains. As previously stated, we expect that the distinct pattern of anticorrelated motions observed between the fingers and thumb domains for 1b is most likely to represent functional motions because 1b lacks C-terminal residues and these residues are known to decrease enzyme activity in vitro.14
Figure 4 shows the electrostatic potential on the front surface of NS5B for enzymes 1* (which does not possess C-terminal residues) and 3* (which does contain a C-terminus). The template and duplex channels have a very positive potential (Figure 4A), while the C-terminus is extremely negative (Figure 4B), causing the two to strongly interact. The negative potential from the C-terminus near the template and duplex channels may create a less favorable environment for RNA template binding and duplex RNA exiting the enzyme (Figure 4C) by destabilizing the negative phosphate groups on the RNA backbone. We note that the presence of the C-terminus by itself does not preclude template binding, as our simulations suggest the template channel is wide enough to accommodate a template strand even if the C-terminus is present. Consistent with this finding, crystal structure 1NB7, which contains a short template strand as well as C-terminal residues, demonstrates that both can occupy the template channel simultaneously.34 However, in 1NB7, the C-terminus interacts directly with the template: we suggest that such interactions may inhibit replication by destabilizing interactions between the template and the remainder of the enzyme and preventing the template from reaching the active site. This phenomenon may reflect yet another way in which the C-terminus decreases enzyme activity.
There are immediate structural effects that occur upon metal binding. The catalytic triad (Asp220, -318, and -319) coordinates the magnesium ions within the active site. In the absence of the metal ions, these residues are free to interact with other residues. Figure 5 shows how residues within and around the active site are affected upon metal ion binding. In 1a, Oδ1 of Asp220 and Hε of Arg222 are observed to rotate toward each other and share a hydrogen bond. Similarly, Oδ2 of Asp319 flips toward Hα of Cys366 to form a hydrogen bond. Finally, Asp318 appears to freely sample several conformations rather than interacting with one specific residue. These interactions subsequently alter the dynamics of the enzyme. However, they do not occur in 1b when the metal ions are present. Projecting trajectory snapshots on principal components (PCs) extracted from the 1b trajectory (see Figure 2) demonstrates that when metal is bound (1b) the enzyme samples conformations distinct from those sampled in the absence of metal (1a). In addition, 1b samples more conformational space along these PCs than 1a, which is consistent with the increased level of conformational sampling seen for 1b (see Table 3).
To highlight the differences in flexibility between 1a and 1b, an rmsf difference plot comparing the two is shown in Figure 6. Standard errors for these values are provided in SI-5 of the Supporting Information. For 1a, several regions in the fingers domain exhibit increased flexibility, with limited flexibility in the palm domain and a rather rigid thumb domain. In particular, the Δ2 loop (residues 145–155) appears to be extremely flexible compared to the rest of the enzyme. In contrast, for 1b we observe moderate flexibility in the fingers domain, a rigid palm domain, and an extremely flexible thumb domain. In comparison to that of 1a, the Δ2 loop of 1b is less flexible; this loop (also known as motif F) is thought to be involved in nucleotide triphosphate (NTP) and template binding during replication.37 Therefore, flexibility in this region may be necessary for the Δ2 loop to function. However, the extreme flexibility seen in 1a could be detrimental, preventing the loop from being properly positioned for binding to NTPs and template nucleotides. The Δ1 loop (residues 25–35), thought to regulate the opening and closing of the enzyme, is less flexible in 1a than in 1b. Flexibility in this region would be necessary for the enzyme to switch between open and closed conformations, as observed in 1b. Thus, the decreased flexibility of these residues observed in 1a may indicate this system is less able to convert between open and closed conformations. The rmsf difference plots for systems 2–4 are provided in SI-6 of the Supporting Information.
At first glance, the correlated motions in 1a and 1b appear to be very similar, but there is, in fact, a distinct change in correlation upon metal binding (Figure 3). The motions for 1b were discussed previously. 1a experiences anticorrelated motions between residues spread over the fingers and palm domains (residues 50–100, 145–170, and 220–360) and the thumb domain (residues 390–520). Visualization of the lowest-frequency PC reveals that the fingers and thumb domains are not moving toward and away from each other as in 1b but are instead moving up and down in an anticorrelated fashion. This is the dominant enzyme motion seen for 1a. Overall, the absence of the metal ions decreases the extent of domain communication [with fewer hydrogen bonds between the fingers and thumb domains (see the next section)] and alters the directions of motion, which may weaken the ability of the enzyme to convert between open and closed conformations.
The rmsd plots for 1 (Figure 7) and other analyses suggest major structural changes occur during the course of the simulations. We analyzed the final 200 ns for 1a and the 200 ns equilibrated section labeled “intermediate” and the 190 ns equilibrated section labeled “fully closed” for 1b to determine the nature of these changes. The average structures for these trajectory segments reveal that 1b makes a conformational change from the “closed” conformation seen in 1* to a conformation we term fully closed. In contrast, 1a does not display this conformational change.
Figure 8A shows magnified views of 1*, intermediate 1b, and fully closed 1b at the intersection of the fingers and thumb domains. The residues shown as licorice begin to interact as the enzyme undergoes the transition to the fully closed conformation. Two hydrogen bonds are necessary to stabilize the fully closed conformation: (i) between Hγ1 of T287 and the carbonyl O of Y448 (thumb β-loop) and (ii) between the carbonyl O of H402 and Hε1 of H95. The “g_hbond” utility in GROMACS indicates that these hydrogen bonds possess lifetimes of 774 and 583 ps, respectively. The first is located near the lower region of intersection between the fingers and thumb domains, while the other is found at the top of the fingers and thumb domains. There is also a weak electrostatic interaction between the carbonyl O of H95 and Hβ1 of P404. A plot of the hydrogen–acceptor distances and donor–hydrogen–acceptor angles for this interaction is provided in the Supporting Information (SI-7). 1* has none of these interactions, as none of the distances between these residues are short enough for hydrogen bonding to occur. Intermediate 1b displays the first hydrogen bond, with the other residues approaching each other more closely compared to those in the X-ray structure. Finally, fully closed 1b has both hydrogen bonding interactions and additional weak electrostatic interactions. Figure 8B is a plot of the domain angle over time for the intermediate and fully closed structures of 1b. The plot shows that the domain angle for the intermediate conformation fluctuates around 67°, while the domain angle for the fully closed conformation, which decreases abruptly at 200 ns, fluctuates around 63°. In contrast, 1* has a domain angle of 71°.
To improve our understanding of which regions of the protein are involved in the transition, we performed covariance analysis on the 400 ns of data for 1b that included the intermediate and fully closed structures (SI-8 of the Supporting Information), using the transition structure (average structure from 490 to 500 ns) as the reference structure. Regions displaying intense “correlation” as a result of this analysis reveal which parts of the enzyme move the most as a consequence of the structural transition. The correlation map reveals significant changes involving the relative position of the Δ1 loop with respect to the fingers and thumb domains, as well as an altered relative conformation of the fingers and thumb domains. These structural changes correspond to the enzyme conformation becoming more closed as the fingers and thumb domains approach each other and suggest an important role for the Δ1 loop in this process.
Structure 1a does not experience this conformational change and displays none of the aforementioned interactions. However, the average structure from the last 200 ns of 1a indicates that the fingers and thumb domains have moved closer together compared to those in 1*, suggesting that the crystallographic coordinates are inherently more open than the enzyme structure in solution.
The closed and open conformations of NS5B are associated with efficient de novo initiation and the exit of double-stranded RNA from the duplex channel during elongation, respectively.20 As previously mentioned, crystallographic coordinates displaying an open conformation have not been reported for HCV genotype 1b. However, we believe that 1* is in a conformation more suitable for elongation (i.e., it is more open), while our fully closed structure is more conducive to de novo initiation. To test this hypothesis, we probed template and duplex channel widths using the distance between Cα atoms of residues on either side of each channel. For the template channel, we used residues 139 and 405, while for the duplex channel, we used residues 96 and 406. In our previous study, we used residues 14 and 96 to probe the template channel;24 however, we found that those distances are not representative of the template channel width in these studies because these residues do not remain in the interior of the enzyme as it undergoes a change in conformation.
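The channel-width measurement described above amounts to a distance between two Cα atoms. The following sketch reads Cα coordinates from PDB-format ATOM records and reports both widths using the probe residues given in the text (139/405 and 96/406). The fixed-column parsing follows the standard PDB file format; the function names and the choice of chain "A" are assumptions made for illustration, and averaging over simulation frames is omitted.

```python
import math

# Probe residue pairs taken from the text (NS5B numbering).
CHANNEL_PROBES = {"template": (139, 405), "duplex": (96, 406)}

def ca_coordinates(pdb_lines, chain="A"):
    """Map residue number -> (x, y, z) of its C-alpha atom for one chain."""
    coords = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        # PDB fixed columns: atom name 13-16, chain ID 22, residue number 23-26.
        if line[12:16].strip() != "CA" or line[21] != chain:
            continue
        resseq = int(line[22:26])
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        # Keep the first record if alternate locations are present.
        coords.setdefault(resseq, xyz)
    return coords

def channel_widths(pdb_lines, chain="A", probes=CHANNEL_PROBES):
    """Return the C-alpha to C-alpha distance (angstrom) for each probe pair."""
    ca = ca_coordinates(pdb_lines, chain)
    return {name: math.dist(ca[i], ca[j]) for name, (i, j) in probes.items()}
```

For a structure from another polymerase, such as 3D-pol, the `probes` argument would be replaced by the structurally aligned residue pairs.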
It is thought that a narrower template channel is necessary for NS5B to identify and interact with the template for replication to begin.20 Once initiation has taken place, the template and duplex channels become wider to allow the nascent RNA to exit the enzyme. The template channel widths are 20.50, 17.43, and 12.20 Å for 1*, 1a, and 1b, respectively, while the corresponding duplex channel widths are 17.40, 13.16, and 10.33 Å, respectively. In comparison, the open conformation of NS5B from HCV genotype 2a (PDB entry 1YV2) has a template channel width of 20.10 Å and a duplex channel width of 17.30 Å. The agreement between values for 1YV2 and 1* indicates that 1* is more structurally similar to an open conformation than the closed conformation commonly suggested. This observation indicates that 1* can readily accommodate double-stranded RNA but is not in an initiation-ready conformation, as the template channel is too wide.
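The comparison with the open reference structure can be made explicit with simple arithmetic. The short sketch below uses only the channel widths quoted above and reports how far each system deviates from 1YV2; the variable and function names are hypothetical.

```python
# (template width, duplex width) in angstroms, as quoted in the text.
WIDTHS = {"1*": (20.50, 17.40), "1a": (17.43, 13.16), "1b": (12.20, 10.33)}
OPEN_REFERENCE = (20.10, 17.30)  # genotype 2a open conformation, PDB entry 1YV2

def deviation_from_open(widths=WIDTHS, reference=OPEN_REFERENCE):
    """Signed difference (system minus reference) for each channel, in angstroms."""
    return {
        name: tuple(round(w - r, 2) for w, r in zip(pair, reference))
        for name, pair in widths.items()
    }

def closest_to_open(widths=WIDTHS, reference=OPEN_REFERENCE):
    """Name of the system whose two channel widths deviate least from the reference."""
    dev = deviation_from_open(widths, reference)
    return min(dev, key=lambda name: sum(abs(d) for d in dev[name]))
```

With these values, 1* differs from 1YV2 by 0.4 and 0.1 Å in the template and duplex channels, whereas 1b is narrower by 7.9 and 6.97 Å, respectively.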
To examine more fully the generality of the observed correlation between duplex channel width and accommodation of duplex RNA, we also measured the duplex channel of the closely related 3D-pol from poliovirus. Because of the difference in sequence between NS5B and 3D-pol, it was necessary to perform a structural alignment of the two proteins to identify 3D-pol residues 113 and 412 as corresponding to NS5B residues 96 and 406, respectively. The alignment was performed between 1* and the 3D-pol with PDB entry 3OL6 (ref 38; this structure contains duplex RNA exiting the enzyme) using the “multiseq” utility in VMD.39 The width of the channel in 3OL6 was found to be 18.6 Å, consistent with that observed in the open conformation present in 1YV2 and 1*. This is additional evidence that the structure of 1* is consistent with an open conformation.
As previously discussed, we believe that 1b is most suitable for de novo initiation, as it exhibits a fully closed conformation. Widths of the template and duplex channels indicate that the C-terminus prevents the enzyme from becoming sufficiently closed to initiate replication, as template channel widths for 2b–4b are significantly wider than those seen for 1b. Figure 9A shows a plot of the template and duplex channel widths for average structures from each protein system as well as for the original crystal structures (1*–4*) and X-ray structures of the open and closed conformations of NS5B from genotype 2a (PDB entries 1YV2 and 1YUY, respectively). The original PDB structures are clustered together and thus share similar conformations. Performing simulations induces the structures to become more closed as indicated by narrower template and duplex channels. However, closure is hindered by the presence of the C-terminal residues lodged between the fingers and thumb domains. In contrast, without interference from the C-terminus, 1b is able to achieve a fully closed conformation with both channels becoming narrow.
Figure 9B shows a plot comparing the angle formed from the centers of mass of the fingers, palm, and thumb domains and the SASA for residues lining the template channel. The domain angle represents the degree of closure; i.e., smaller angles indicate a more closed enzyme. 1* has the largest domain angle (71°), while 1b has the smallest angle (63°). This change in angle indicates the conformational change from a more open conformation to the fully closed conformation. The structures containing C-terminal residues have values ranging from 66° to 68°, which resemble the domain angle for intermediate 1b. This plot also shows 1b has the largest SASA even though it has the smallest angle between domains. This increase in SASA occurs because residues lining the template channel become exposed to the center of the enzyme and increase their accessibility to solvent. This could be a consequence of these residues getting into place for de novo initiation in the closed conformation. In addition, 1v from our previous study, which we described as hyper-closed, has the smallest domain angle but only a moderate SASA value. This hyper-closed conformation may not be amenable to de novo initiation, as the increased SASA seen in 1b may represent an increased capacity for the enzyme to receive and interact with an incoming template. This observation is consistent with the fact that 1v is derived from a simulation of NS5B bound to an allosteric inhibitor in the thumb domain and thus likely represents an inactive enzyme state. Overall, Figure 9 demonstrates that the original X-ray structures are more open than 1b, which we believe represents the most functionally relevant state of the enzyme.
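The domain angle can be illustrated with a short calculation. The sketch below computes mass-weighted centers of mass for three groups of atoms and the angle between them. We assume here that the vertex of the angle lies at the palm domain (fingers–palm–thumb); the text does not state this explicitly, and the function names and input layout are hypothetical.

```python
import math

def center_of_mass(atoms):
    """Mass-weighted center of a list of (mass, (x, y, z)) entries."""
    total_mass = sum(m for m, _ in atoms)
    return tuple(sum(m * xyz[k] for m, xyz in atoms) / total_mass for k in range(3))

def domain_angle(fingers, palm, thumb):
    """Angle in degrees at the palm center of mass (assumed vertex)."""
    f, p, t = (center_of_mass(d) for d in (fingers, palm, thumb))
    v1 = [f[k] - p[k] for k in range(3)]
    v2 = [t[k] - p[k] for k in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(y * y for y in v2))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))
```

Evaluated frame by frame, this quantity would give a time series of the kind plotted in Figure 8B, with smaller values indicating a more closed enzyme.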
We have shown that the binding of metal ions, as well as the presence of the enzyme’s C-terminus, has a global influence on the structure and dynamics of NS5B. The direct interactions between the C-terminus and the fingers and thumb domains restrict the motion of those domains. This diminishes enzyme flexibility and disrupts communication between the two domains, preventing the enzyme from sampling conformations that are likely to be functionally relevant. We observe that the presence of the C-terminus prevents the enzyme from achieving a fully closed conformation even when metal ions are bound. In addition, the interactions between the C-terminus and the β-loop in the thumb domain (residues 450–455) may prevent this loop from conducting its proposed function of guiding the template to the active site.36
We note that the C-terminus of NS5B may adopt a different conformation in vivo, where it is expected to be membrane-associated and thus unable to associate with the remainder of the enzyme. Consequently, the conformation of the C-terminus described in these studies is likely to be most pertinent to in vitro systems, such as those often used to study the structure and RNA polymerase activity of NS5B. Other conformations of the C-terminus in vitro seem unlikely: this region of the enzyme is highly hydrophobic and would not be anticipated to be solvent-exposed in solution. Consequently, it is more probable that the C-terminus would be found in a conformation in which it is sequestered from solvent, consistent with its location in the cleft between the fingers and thumb domains in the available crystal structures. Thus, a truncated enzyme such as 1 is our best model for reproducing the impact of the C-terminus being membrane-associated as it is expected to be in vivo.
Experimental studies have shown that the C-terminal residues inhibit NS5B activity, though the specific mechanism by which this occurs is unclear.13,14 Our simulations are consistent with these data and illustrate possible molecular mechanisms of this inhibitory effect. Moreover, the different enzyme sequences studied in this work, such as 2HAI, illustrate that these observations are robust with respect to sequence variation. This consideration is important given the strong propensity of the HCV genome to mutate because of the error-prone nature of NS5B. Overall, our studies suggest that C-terminal residues are not present between the fingers and thumb domains during replication in vivo because they prevent NS5B from achieving a fully closed conformation that would facilitate the initiation of RNA replication.13–15 Because of the dramatic impact of the C-terminus on enzyme properties, we suggest that those seeking to conduct simulation studies of this enzyme remove C-terminal residues to allow the protein to explore more conformational space and exhibit dynamics that are more likely to be functionally relevant (see below).
Divalent metal ions such as magnesium are directly involved in catalysis, and previous studies have shown that metal ion binding creates a more structurally stable enzyme.16,17 In this study, we are able to determine how this occurs at the molecular level. In addition, we show that the metal ions do more than provide stability: they also impact the global structure and dynamics of the enzyme and thus induce long-range, allosteric effects. Overall, metal ion binding increases the flexibility of the enzyme, allowing the enzyme to sample additional conformations, including a newly observed fully closed conformation. We believe this fully closed conformation facilitates de novo initiation for two reasons. First, the template channel is narrower, facilitating the interactions required for initiation. Second, there are more residues exposed to the surface of the template channel, putting them in position to receive and guide the template to the active site.
In previous studies, we identified a ligand-bound NS5B (1v) as having a hyper-closed conformation compared to that of its ligand-free form.24 However, the fully closed conformation displayed by 1b in the study presented here differs from the previously observed hyper-closed conformation. The ligand-bound 1v structure was previously shown to occupy a distinct conformational space from the ligand-free (1f) structure. We compare 1v and 1f to average structures from our current simulations in Figures 2 and 8 using a number of different metrics. The data show that 1v from our previous study is unlike any of the PDB structures or our currently simulated systems, supporting our previous suggestion that this ligand-bound structure displays an inactive conformation that is not on the pathway of functional conformational transitions. This observation is consistent with the induced fit model of allosteric inhibition we previously suggested.24 Such a model would indicate that ligand binding induces a protein conformation that differs from preexisting conformations of the unbound protein.40,41 We note that the current simulations are much longer and thus are likely to have sampled more of the underlying conformational space than our previous studies. We also note that the prior studies were conducted using the CHARMM 22 force field and CHARMM simulation program, which differ from the OPLS-AA force field and GROMACS simulation program used in the work presented here. While these differences may affect the details of our observations, the fact that we see similar trends in both studies (e.g., similar patterns of correlated motion in 1b) suggests that our overall conclusions are robust.
Our simulations also share similarities with computational studies performed by Moustafa et al.42 They performed MD simulations of several picornaviral RNA polymerases (poliovirus, coxsackievirus, foot-and-mouth disease virus, and bovine viral diarrhea virus), as well as a G64S mutant of the polioviral polymerase. NS5B is structurally and functionally related to these enzymes. Although picornaviral RNA polymerases have ~100 fewer residues than NS5B and lack a C-terminus, they share the right-handed organization common to many viral polymerases, including the fingers, palm, and thumb domains. Overall, Moustafa et al. observed patterns of correlated motions that appear to be conserved across the picornavirus family and are similar in general to those we observe for metal-bound NS5B, except that we observe more intense correlated motions. Via structural alignment, we determined that the general patterns of flexibility and anticorrelated motions observed between the fingers and thumb domains of NS5B by Moustafa et al. are analogous to those observed in our previous study and are also similar to those seen in 1b. These observations indicate that RNA-dependent RNA polymerases from the different viruses may experience similar equilibrium motions, suggesting that their dynamics are conserved.24,42 These enzymes may therefore function via similar mechanisms and be functionally regulated in similar ways. Consequently, the information gained in this work may be useful in understanding the functional properties of a range of viral polymerases. Ultimately, this knowledge may reveal how best to inhibit the function of such polymerases in a therapeutic context.
In addition, our results support the findings of several experimental studies. In particular, a study by Chinnaswamy et al. observed open and closed conformations of NS5B using negative-stain electron microscopy and single-particle reconstruction. Their results suggest that the Δ1 loop regulates the transition from open to closed.23 In the work presented here, we likewise observe that the Δ1 loop plays an important role in the transition to the fully closed conformation. Our observations indicate that the Δ1 loop is strongly associated with the thumb domain and suggest that the loop plays a role in regulating the thumb domain’s motions. Such a role would be required for it to regulate the transition between open and closed conformations. While PDB structures such as 1*–4* are widely believed to be in the closed conformation, their similarities to structures of viral polymerases known to be in the open conformation (such as 1YV2 and 3OL6) suggest they are more appropriately classified as open. Thus, such structures may not be directly applicable to elucidating the process by which RNA replication is initiated, as this process requires a closed enzyme. It may be necessary to first ensure that the closed conformation of the enzyme is attained (e.g., via MD simulations initiated from these structures).
Finally, we compared the structure of the fully closed conformation seen in this work with that of the related RNA polymerase from bacteriophage ϕ6 (PDB entry 1HI0). This structure represents an initiation complex of the enzyme and is similar to our fully closed conformation in several respects. We identified template and duplex channel residue probes through a structural alignment with NS5B and calculated widths of 14.2 and 13.0 Å for the template and duplex channels, respectively. In addition, we calculated a domain angle of 65.4°. These values are very similar to the corresponding quantities for the fully closed conformation attained by 1b (see Figure 9A), supporting our assertion that this conformation of NS5B is suitable for the initiation of RNA replication.
This study reveals that the presence of magnesium ions and C-terminal residues has a significant impact on the dynamics and structure of NS5B. Magnesium ions induce specific structural and dynamic changes that are consistent with an enzyme that is more competent in initiating RNA replication. In contrast, the C-terminus of the enzyme restricts protein dynamics and prevents the sampling of conformations that are likely to facilitate the initiation of replication. This is the first computational study of apo and metal-bound NS5B structures at these time scales and a first step toward a better understanding of the intrinsic functional dynamics of NS5B and other RNA polymerases. This knowledge may ultimately aid efforts to identify inhibitors of these enzymes that can serve as the basis for antiviral therapies.
This work was supported by the Department of Chemistry and Biochemistry of the University of Maryland, Baltimore County (UMBC). This work made use of the UMBC High Performance Computing Facility that is supported by the National Science Foundation through the MRI program (Grants CNS-0821258 and CNS-1228778) and the SCREMS program (Grant DMS-0821311), with additional substantial support from UMBC. This study also employed resources from the Extreme Science and Engineering Discovery Environment (XSEDE) that are supported by National Science Foundation Grant OCI-1053575.
Supplemental tables and figures. This material is available free of charge via the Internet at http://pubs.acs.org.
The authors declare no competing financial interest.