|Home | About | Journals | Submit | Contact Us | Français|
The protein nanopore Mycobacteria smegmatis porin A (MspA), can be used to sense individual nucleotides within DNA, potentially enabling a technique known as nanopore sequencing. In this technique, single stranded DNA electrophoretically moves through the nanopore and results in an ionic current that is nucleotide-specific. However, with a high transport velocity of the DNA within the nanopore, the ionic current cannot be used to distinguish signals within noise. Through extensive (~100 μs in total) all-atom molecular dynamics simulations, we examine the effect of positively charged residues on DNA translocation rate and the ionic current blockades in MspA. Simulation of several arginine mutations show a ~10–30 fold reduction of DNA translocation speed without eliminating the nucleotide induced current blockages. Comparison of our results with similar engineering efforts on a different nanopore (alpha-hemolysin) reveals a non-trivial effect of nanopore geometry on the ionic current blockades in mutant nanopores.
Nanopores have emerged as versatile research tools for single molecule detection and analysis.1 One of the most prominent applications of nanopores is rapid, label-free, and real-time sequencing of DNA.2 In the original concept of nanopore sequencing,3 single stranded DNA (ssDNA) is electrophoretically driven through a nanopore. Inside the nanopore, DNA nucleotides uniquely modify the co-passing ionic current, yielding an ionic current trace that contains information about the DNA sequence. With a number of recent advances in both biological and solid-state nanopores, nanopore sequencing of DNA is becoming more of a reality.2
The most common nanopores used for DNA transport measurements are naturally occurring or genetically engineered proteins alpha-hemolysin (αHL)4 and Mycobacteria smegmatis porin A (MspA)5 and solid-state nanopores made in thin silicon nitride6–8 or graphene9–11 membranes. In the case of MspA, experimental studies using immobilized DNA strands have already demonstrated the possibility of distinguishing all four DNA nucleotides via ionic current measurement12–14 and detection of single nucleotides in a random sequence background.14 Nanopores have also been used to detect RNA,15–17 epigenetic modifications such as methylated cytosine and 8-oxoguanidine14,18, 19 and DNA damage.19
A common problem with using nanopores for single molecule analysis of DNA is the speed of the analyte through a nanopore. The average speed of DNA transport in free translocation experiments typically exceeds 1 nucleotide per microsecond, which is too high to detect the local structure or sequence of the transported DNA via ionic current measurement. Consequently, a number of techniques have been proposed to reduce and control the speed of DNA transport through nanopores.20, 21 Simple techniques, such as increasing the buffer viscosity or decreasing temperature can reduce the translocation speed22–25 but also reduce the ionic current signal as the ion mobility also decreases. More complex techniques such as coupling the nanopore with processive enzymes that sequentially displace DNA26, 27 through a nanopore have shown to be successful, but require specific conditions to keep the DNA processing enzymes operational. Yet another approach is to modify the nanopore surface by attaching complementary DNA sequences28, modifying the DNA analyte,13, 29 changing the charge state of the nanopore30 or using the steric confinement of a nanopore to arrest DNA motion.31
In this article, we report extensive all-atom molecular dynamics (MD) simulations that assessed the feasibility of slowing DNA transport through modified MspA nanopores and using such nanopores for analysis of nucleic acids. We chose to focus our efforts on MspA, as this nanopore has features32 sufficiently small to enable detection of individual DNA nucleotides by measuring ionic current.13, 14 Our baseline system is the MspA nanopore, denoted M1-NNN, which was previously used to demonstrate nucleotide-specific ionic current blockades. The constriction of this nanopore already contains asparagine substitutions at positions 90, 91 and 93 in all chains of the octomer. We hypothesize that additional positively charged arginine substitutions near the constriction of MspA can reduce the speed of DNA transport while preserving the nucleotide-type specificity of the ionic current blockades.
To determine the effect of arginine mutations on the transport rate of ssDNA through MspA, we constructed all-atom models of the MspA nanopore, embedded in a lipid bilayer membrane and submerged in 1 M KCl electrolyte. Figures 1a shows one such system featuring a full-length MspA nanopore and a 58-nucleotide DNA strand threaded through it using the phantom nanopore method.33 Using the all-atom molecular dynamics method,34 we simulated electric-field driven transport of the DNA strand through several variants of the MspA nanopore. Short (<100 ns in duration) exploratory simulations performed under a 1.2 V transmembrane bias35 revealed transient arrests of ssDNA transport as multiple salt-bridges formed between the phosphate groups of the DNA backbone and the amine groups of arginine substitutions of leucine 88 residues of the L88R MspA nanopore, see Supporting Information Figure S1. However, the high bias used in these simulations prevented the effect of arginine substitutions to fully develop as the energy gain from transporting one nucleotide through the constriction of MspA (~1.2 eV or 46 kBT) is considerably greater than a typical energy of an ionic salt bridge in water (~ 5 kBT). Furthermore, applying a 1.2 V bias in experiment is known to render the bilayer unstable on the experimental time scale.
To observe multiple, microsecond duration trajectories of ssDNA transport through MspA nanopores at the experimental bias of 180 mV, we used an ensemble of sixteen reduced-length MspA systems, each having different starting conformation of a DNA strand in the nanopore. Each simulation system contained a single copy of a reduced MspA nanopore (residues 75 to 120), a poly(dC) strand containing from 13 to 21 nucleotides (depending on the system) and connected to itself via periodic boundary conditions, a patch of a lipid bilayer membrane containing about 60 lipid molecules and 1M KCl solution. Figure 1-c shows one such reduced-length MspA system; all sixteen systems are shown in the Supporting Information Figure S2. To prevent structural deterioration of the truncated nanopore, all Ca atoms of the protein were restrained to their crystallographic coordinates. Our 200-ns simulations of full-length and reduced-length MspA systems revealed very similar electrostatic potential distributions in the nanopore constriction at a 180mV bias, a bias typically used in experiment.
Figure 1-d and 1-e illustrate a typical simulation of a reduced-length M1 MspA system. In that particular trajectory, approximately 33 nucleotides passed through the nanopore constriction in 750 ns, which is consistent with the experimental estimates12 of the nucleotide transport rate. The translocation proceeds in bursts, where long pauses follow rapid advancements, similar to transport observed in αHL.36 The ionic current computed for the same trajectory (Figure 1-e) exhibits large fluctuations and no apparent correlation with the DNA transport rate. An animation provided in the Supporting Information illustrates this trajectory.
Figure 2 shows the number of nucleotides transported through the constriction of MspA versus simulation time for the ensemble of 16 simulation systems and four MspA variants: the baseline M1-NNN nanopore, single arginine mutant L88R-NNN, and two triple arginine mutants L88R/T83R/S116R-NNN and L88R/A96R/S116R-NNN. The location of the arginine substitutions is shown in Figures 2-b, c, and d. Due to the octomeric architecture of MspA, the total number of arginine residues is 8 in L88R-NNN and 24 in the triple mutant nanopores. Within ~14 μs of aggregate simulation time, ~250 nucleotides were observed to pass through the constriction of M1-MspA, against 45 nucleotides observed to pass through the L88R-NNN nanopore within ~12 μs. A linear fit to the cumulative permeation traces suggests a 5-fold reduction of the permeation rate through L88R-NNN nanopore in comparison to M1-NNN MspA. A closer inspection of the trajectories reveals that the permeation rates are not constant for the L88R-NNN nanopore: the nucleotides appear to move faster at the beginning of the individual patches than at the end, as formation of DNA-Arg contacts does not occur instantaneously. Furthermore, even after the contacts are formed, DNA can still be displaced by several nucleotides through the constriction of MspA before the effect DNA-Arg contacts becomes apparent.
In Figure 3, we plot the nucleotide translocation rate averaged over 16 different systems as a function of the time elapsed from the beginning of the simulations. The average transport rate through the M1-NNN MspA nanopore gradually decreases with time and is approximately 20±5 nucleotides/μs. A linear fit to the cumulative trace in Figure 2-a yields a similar value. In contrast, the average permeation rate for the L88R-NNN mutant is clearly higher within the first 200 ns of the trajectories than within the last 450 ns. Excluding the initial 300 ns, the average permeation rate is about 0.7±0.4 nucleotides/μs. Thus, our MD simulations suggest that L88R-NNN mutation can reduce the transport velocity of ssDNA through MspA by at least 20–30-fold, compared to M1–NNN MspA.
Our simulations of the triple mutants suggest the possibility of reducing the DNA translocation velocity even further. Although the total number of nucleotides transported through triple mutants in our ensemble simulations (Figures 2g and 2f) appears to be similar to that observed for the L88R-NNN mutant, the permeation rates (Figure 3-a, inset) drop to zero as the simulation progresses. The trajectories of the two triple mutants were shorter than those of the L88R-NNN mutant and the permeation rates did not reach a steady value. Unable to accurately determine the steady-state transport rate from ensemble simulations, we continued one of the L88R/T83R/S116R-NNN trajectories for additional 4 μs using DESRES supercomputer Anton37 The transport rate obtained from this continuous 4-μs trajectory was close to zero (−0.4 ± 0.6 nucleotides/μs), which indicates a reduction of the DNA transport rate in comparison with an ensemble-average rate of the L88R mutant. An animation provided in the supporting information illustrates the 4-μs trajectory.
In Figure 3-b, we plot the ensemble average of the number of persistent contacts between the DNA and the arginine residues versus the simulation time. Here, a persistent contact is defined to exist between a nucleotide and an arginine residue if the non-hydrogen atoms of the nucleotide reside within a cutoff of 4 Å of the non-hydrogen atoms of the same arginine residue for at least 80% time within 10 ns. For all three mutant nanopores, the number of DNA-arginine contacts increases steeply in the initial 150 ns before reaching an approximately steady-state value, which correlates with the observed reduction of the DNA transport rate, Figure 3-a. For the L88R-NNN nanopore, the number of contacts continues to increase throughout the trajectory at a smaller rate. The L88R/T83R/S116R-NNN mutant formed the highest steady-state number of contacts (~7), the L88R/A96R/S116R-NNN nanopore had about 6 while the L88R-NNN mutant had the least number of contacts (~2–3). A scatter diagram of the number of persistent contacts versus the DNA transport rate, Supporting Information Figure S3, indicates that a higher number of persistent contacts correlates with a smaller rate of DNA transport.
In view of the larger number of contacts formed, the L88R/T83/S116R-NNN is likely to be more effective than the other mutants studied in controlling the translocation of the DNA. Figure S4 in the supporting material shows the number of contacts for each of the mutated residues in the two triple mutant nanopores. Residues R116 are seen to be the most effective in both cases, forming between 2 and 3 contacts with the DNA. Residues R83 are the second most effective, forming ~2 contacts. R88 and R96 residues are seen to form fewer than two contacts with the DNA.
The anticipated effect of arginine mutations was electrostatic attraction between positively charged amine groups and negatively charged DNA backbone. Surprisingly, visual inspection of the MD trajectories revealed two types of persistent DNA-arginine contacts. As expected, phosphate groups of the DNA backbone formed salt bridges with the amine groups of arginine residues, as shown in Figure 3-c. The second type of persistent contact involved stacking of the cytosine base with the arginine side chain, as shown in Figure 3-d. This type of contact is similar to the stacking observed in the crystal structure of the human protein U1A bound to RNA.38 Supporting Information Figure S5 displays the number of salt-bridge and stacking-type contacts for the three mutant nanopores. The L88R/A96R/S116R-NNN nanopore is seen to form a greater number of stacking-type contacts (~3) than salt bridges (~2). For the L88R-NNN and M1L88R/T83R/S116R mutants, the stacking and salt bridge contacts are almost equally prevalent (~1 for the L88R-NNN and ~3 for the L88R/T83R/S116R-NNN mutants respectively).
Recent experimental30 and computational39 studies systematically examined the effect of arginine mutations on the DNA transport rate and ionic current blockades in αHL. In the experiments with αHL, the residual current was ~0 pA when the arginine substitutions were in the stem of αHL, rendering the nanopores less suitable for DNA sequence detection. In our simulations of MspA mutants, we find ionic current blockades in the modified MspA to retain sensitivity to the sequence of DNA strands threaded through the nanopore’s constriction.
To investigate the effect of arginine substitutions on ionic current blockades in both MspA and αHL nanopores, we computed the residual ionic current from our ensemble simulations of M1-NNN and arginine mutants of the MspA nanopore. For comparison, we performed two 150-ns simulations of ionic current blockades in wild-type and triple-mutant αHL nanopores (Figure 4), using previously reported computational models.33,36 For MspA we find the residual current for M1 to be 127 ± 8 pA and for L88R to be 120 ± 6 pA. These values are indistinguishable within the statistical uncertainties. Here and in the rest of the manuscript, the standard error was computed using current values block-averaged over 100 ns and the 16 independent trajectories. The average currents through triple mutants were identifiably greater with currents of 164 ± 13 pA for L88R/T83R/S116R-NNN and 205 ±15 pA for L88R/A96R/S116R-NNN. The simulated blockade currents are systematically higher than those measured experimentally using NeutrAvidin anchors12–14 as the outer rim of the MspA nanopore are absent in our reduced-length systems. In terms of percentage of the open nanopore currents, the average simulated blockade current is 13.8±1.2% for the M1-NNN nanopore, 13.2 ±0.9 % for the L88R, 19.2±1.9% for the L88R/T83R/S116R and 23.8±1.8 % for the L88R/A96R/S116R mutations. Figure 4c shows the distributions of the blockade currents for M1 and L88R/A96R/S116R nanopores. Thus, the residual current through triple mutant MspA nanopore is almost twice the value of the M1 nanopore. Figure 4-d shows similar data for our simulations of αHL with three arginine substitutions resulting in 21 additional positive charges. In the case of αHL, triple arginine substitutions completely block the nanopore current. Thus, the arginine substitutions have opposite effect on the residual current in αHL and MspA, which, as we show below, originates from the difference in the shape of the two nanopores.
For both M1 MspA and WT αHL nanopores, a negatively charged DNA backbone inside the nanopores ensures that the total ionic current is primarily carried by potassium ions (87% of total current in MspA and >90% of total current in αHL). With the additional positive charges in the triple arginine αHL mutant, the current of K+ drops to zero while the current of Cl− remains at the same (negligible) level (Figure 4-b). In contrast, both K+ and Cl− currents increase when positively charged arginine side chains are introduced near the constriction of MspA (Figure 4-a). The change in the ion density in both nanopores (Figures 4-e,f) reveals the microscopic effect of the positive charges. For αHL, the arginine substitutions deplete K+ ions in the stem of the channel and do not considerably affect the distribution of Cl− (Figure 4-e). However, for MspA, the arginine substitutions considerably increase Cl− concentration in the vestibule of MspA without altering the density of K+ near the trans entrance of the nanopore. The difference in the effects arises from the shape of the channels. The stem of αHL is so narrow that the negatively charged backbone of DNA serves as a poly-counterion to the positively charged side chain of the substitutions and vice versa, excluding ions. In contrast, the vestibule of MspA is considerably wider than the Debye length at 1M KCl, which allows arginine side chains at one side of the channel to interact with DNA, while the remaining side chains to attract Cl− ions. The open geometry combined with the arginine substitutions allows an increase in the overall concentration of charge carriers in the channel, thereby increasing the residual current.
To demonstrate that arginine mutations in MspA do not eliminate the nucleotide sensitivity of ionic current blockades, we repeated our ensemble simulations of the M1, L88R and the two triple mutant nanopores using thymine homopolymers. Similar to the poly(dC) systems, we observed reduction of ssDNA translocation velocity in the poly(dT) systems, Figure S6. The average currents recorded for poly(dC) and poly(dT) strands and the four MspA systems are specified in Table 1. In the case of M1 MspA, the ensemble-average current of the poly(dT) systems is considerably greater than that of poly(dC), in agreement with experiment14. Similar currents are recorded in the case of M1 L88R mutant. Rather surprisingly, the blockade current difference between poly(dT) and poly(dC) systems in the triple arginine mutants differs from those in the M1 and M1 L88R systems. In the case of L88R/A83R/S116R, the current blockade difference is within the statistical error, however, in the case of L88R/A96R/S116R, the ensemble-average current of the poly(dC) systems is considerably greater than that of the poly(dT) systems, opposite to what is observed in M1 MspA. Such a reversal of the blockade current difference cannot be directly linked to the increased current of chloride ions, as the fractional currents carried by ionic species are similar for the two DNA sequences.
In general, the charge state of a nanopore can have a non-trivial effect on the overall transport rate of a charged analyte. Thus, introducing negatively charged side chains in the lumen of alpha-hemolysin was shown to both catalyze the transport of short positively charged peptides 39 and slow the transport of longer positively charged polypeptides.40 Hence our arginine mutations to MspA may both increase the DNA capture12 and slow the DNA transport.
In this study, we performed extensive (~100 μs in total) all-atom MD simulations to investigate electric field-driven transport of single DNA strands and ions through biological nanopore MspA containing arginine mutations. In nanopores decorated with the positive charges above the constriction, persistent contacts between the arginine substitution and DNA backbone, or by base-stacking interactions, could slow the DNA translocation by a factor 10–30. Because of MspA’s geometry, the charge substitutions did not eliminate nucleotide-specific current. These results reveal the importance of nanopore geometry and charge location and may help direct the modification of nanopores such as MspA to improve their utility for nanopore sequencing and other nanopore technologies. Furthermore, our simulations demonstrate that precise placement of arginine mutations can have a non-trivial effect40 on the sequence specificity of the ionic current blockades.
The MD simulations were carried out using program NAMD241 on Cray XT-5 machines or using D.E. Shaw Research’ special purpose Anton supercomputer.37 All simulations employed periodic boundary conditions and multiple timestepping41, 42 wherein local interactions were calculated every 2 fs time step and full electrostatic evaluations were performed every 3 time steps. NAMD simulations employed the particle mesh Ewald method for evaluation of long-range electrostatics.41, 42 Simulations on Anton were performed using the Nose’-Hoover NVT integrator with a k-Gaussian Split Ewald method43 for long-range electrostatics. The CHARMM2744 parameter set, along with TIP3P45 water model were used in all simulations. All covalent bonds involving hydrogen were kept rigid in water and other molecules using SETTLE46 or RATTLE47 algorithms. The cutoff and switching distance for evaluation of van der Waals and short-range electrostatic interactions were set to 8 and 7 Å, respectively.48, 49
After a 2000 step minimization, each systems was equilibrated using NAMD for at least 2 ns in the NPT ensemble using Nose’-Hoover Langevin piston pressure control5 at 295 K with no external electric field applied. Following equilibration, all NAMD simulations were performed in the NVT ensemble using the Lowe Andersen thermostat51 with a voltage bias of 180 mV imposed across the membrane as described in.34, 35 In all simulations, harmonic restraints with a spring constant of 695 pN/nm applied to all Ca atoms of the protein, restraining them to the crystallographic coordinates. Additional simulations were carried out on Anton to obtain a continuous 4-μs long trajectory of ssDNA transport through the truncated L88R-NNN/T83R/S116R nanopore, our trajectories each of 400 ns for the M1 L88R nanopore and nine trajectories for the M1 nanopore varying between 200 and 360 ns. Each Anton simulation was a continuation of the corresponding NAMD trajectory and employed the same force fields and constraints.
Atomic coordinates of the MspA porin were obtained from the Protein Data Bank (entry 1UUN)5. Eight R96 residues resolved in the crystal structure were changed to alanines in accord with the sequence of WT MspA. 24 aspartate residues in the constriction of the porin were replaced by asparagines to create the D20N/D91N/D93N mutant referred to as M1-NNN MspA. The resulting structure was merged with a 12×12 nm2 patch of POPC lipid bilayer such that the hydrophobic β-barrel of MspA embedded in the bilayer while its hydrophilic cap protruded above the membrane. All lipid molecules overlapping with the nanopore were removed; the remaining bilayer contained 272 lipid molecules. Using the phantom nanopore method,33 a 58 -nucleotide random sequence DNA strand was placed inside the nanopore of MspA with the backbone approximately aligned with the nanopore axis and the 3′ end of the strand emerging at the trans end of the nanopore. The system was immersed in ~35,000 pre-equilibrated TIP3P water molecules. K+ and Cl− ions were added at random positions in the water layers corresponding to a concentration of 1 M. Additional K+ ions were added to compensate the charge of the MspA nanopore and the DNA fragment, bringing the total charge of the final system to zero. The final system, shown in Figure 1-a, consisted of ~167,000 atoms and measured 12 nm × 12 nm × 17 nm.
The minimal model of the MspA nanopore was constructed by eliminating the vestibule part of MspA that extended away from the lipid membrane on the cis side in our full-length model (Figure 1a–c). The reduced nanopore contained residues 75 to 120. An 8 nm × 8 nm patch POPC bilayer containing 60 lipid molecules was merged with the truncated nanopore. An ensemble of sixteen systems, each differing in the initial conformation of the DNA, was created. The sixteen different DNA configurations were obtained from a ~110-ns trajectory of full-length MspA at a 1.2~V transmembrane bias. Each DNA/protein/lipid system was solvated in a volume of pre-equilibrated TIP3P water containing about 3800 molecules. DNA segments that extended outside the water box were deleted, the nucleotide sequence of the DNA strand was changed to comprise a cytosine (or thymine) homopolymer. K+ and Cl− ions were added at random positions to 1 M concentration and bring the total charge of each system to zero. Following usual minimization and equilibration protocol, the ends of each DNA strand were brought together (across the periodic boundary) in a 240 ps SMD simulation.52 Subsequently, the ends were covalently joined via a custom patch to create an effectively infinite DNA strand joined through periodic boundary conditions. The electrostatic profiles in full-length and reduced MspA systems, Figure 1-b, appear to be very similar, and so are the forces applied to DNA and ions, which validates our reduced nanopore approach.
In addition to M1-NNN MspA nanopore, the following three additional variations were constructed: a single mutant L88R-NNN, with all LEU88 replaced by arginines, and two triple mutants L88R/T83R/S116R-NNN and L88R/A96R/S116R-NNN where all LEU88, THR83, SER 116 or LEU88, ALA96, SER116 were replaced by arginines. Figure 2(b–d) illustrates the location of the mutations. Each reduced system consisted of about 28,000 atoms. Using custom scripts, all sixteen systems of the ensemble could be simulated in parallel on an XT-5 system, producing ~1 μs of aggregate simulation time within 24 hours.
This work was supported by grants from the National Institutes of Health (HG-005115) and the National Science Foundation (DMR-095595). Computer time was provided via Teragrid/XSEDE allocation MCA05S028, Department of Energy INCITE program and the National Resource for Biomedical Supercomputing at the DESRES supercomputer Anton at the Pittsburgh Supercomputing Center (PSCA00052). A. A. would like thank the Department of Bionanosciences at Delft University for hospitality and support from the Netherlands Organisation for Scientific Research (NWO)
Supporting Information Available: MD simulations of full-length MspA and its mutants; molecular graphics images of the sixteen ensemble systems; plots of number of DNA-arginine contacts vs permeation rate; plots of the number of persistent contacts vs time for the two triple mutants; plots of the number of electrostatic and stacking contacts; ensemble permeation rates of the thymine homopolymers; animations illustrating DNA permeation trajectory in M1-NNN and triple mutant MspA nanopores. This material is available free of charge via the Internet at http://pubs.acs.org.