A commercial nanopore device for de novo DNA sequencing will require integration of six features: i) Automated capture and processing of genomic DNA templates in single file order from a heterogeneous mixture over many hours. This is essential to eliminate extensive DNA sample preparation common to other sequencing technologies and thus fully exploit the speed of nanopore sequencing; ii) systematic spatial control at circa 5 Å precision; iii) temporal control at 0.1-1000 ms per nucleotide; iv) absence of complex active voltage control. This is necessary to avoid crosstalk in a compact electronic array of thousands of nanopores; v) a sensor that can determine single base identity; and vi) a counter that can identify transitions between nucleotides along homopolymeric regions.
Devices based on the pore-forming protein α-HL are common in nanopore technology 3-5, 8. Briefly, a single α-HL nanopore is inserted in a lipid bilayer that separates two wells that contain 100 μl each of a buffered KCl solution. Negatively charged single-stranded DNA (ssDNA) is added to the cis well. A voltage applied between the wells (trans side +) causes the ssDNA to enter and electrophorese through the nanopore. This event results in a brief current blockade that is influenced by DNA strand length 3 and base composition 5.
Figure 1 Experimental set-up. (a) Nanopore device. A single α-HL nanopore is inserted in a lipid bilayer that separates two wells, each containing 100 μl of a buffered KCl solution. DNA bearing a ssDNA segment is added to the cis well.
A consensus has emerged that the average rate of ssDNA electrophoresis through nanopores (approximately 3 μs nt⁻¹ at 120 mV for α-HL) is too fast to allow accurate base identification 6. Therefore a functional nanopore sequencing device will require a means to systematically slow DNA template movement. One proposed strategy involves coupling a protein motor to the nanopore 7. This strategy is attractive because many processive enzymes, including polymerases, ratchet along DNA strands one nucleotide at a time, up to tens of thousands of times in succession in bulk phase 9. In addition, as the polymerase drives the DNA strand through the nanopore sensor, the force of the electric field acting in the opposite direction is predicted to hold the strand taut, and therefore reduce base read errors caused by Brownian motion 10.
Thus far most progress has been made with DNA polymerases. T7 DNA polymerase (T7 DNAP) was shown to bind to DNA captured in the α-HL pore, and then catalyze nucleotide additions that advanced the template strand through the pore against an 80 mV applied voltage 11. Replication was blocked in bulk phase using synthetic DNA ‘blocking oligomers’, and was initiated only after each DNA template was captured by the nanopore with attendant removal of the block by voltage-driven unzipping. Although an important proof of concept, this method could not be used in a practical sequencing device for two reasons: 1) at most three sequential ionic current steps could be observed before the T7 DNAP dissociated from the DNA template under load; and 2) to remove the blocking oligomer and subsequently bind T7 DNAP, the DNA was tethered in the pore and driven back and forth by reversing voltage polarity at 10 ms intervals. This would result in an unacceptable level of crosstalk between pores in a compact commercial array.
More recent experiments showed that a B family polymerase, phi29 DNAP, remained bound to DNA captured in the α-HL pore approximately 10,000-fold longer than did A family polymerases 12. DNA replication by phi29 DNAP controlled sequential movement of at least 50 bases through the α-HL pore from a precise starting point on a DNA primer strand without active voltage control. The rate of elongation and template displacement was tens of milliseconds per nucleotide, consistent with previously measured rates 13,14. However, these experiments relied upon transient chemical protection of the DNA primer terminus to prevent elongation and excision in bulk phase 12. This approach permitted only a 20 minute window to capture unmodified DNA from bulk phase, and therefore is not suitable for a commercial sequencing device.
In the work presented here we combine phi29 DNAP-dependent template control with an improved blocking oligomer strategy for capture, activation and electrophoresis of up to 500 DNA molecules in single file order through individual nanopores. Importantly, we discovered that blocking oligomers potentiated formation of phi29 DNA polymerase-DNA complexes in bulk phase that were quiescent for at least five hours. This eliminated the need for active voltage control to effect polymerase binding to DNA. It also enabled automated forward and reverse ratcheting of each DNA strand through the pore, exploiting phi29 DNAP in two distinct mechanistic modes for template strand analysis.
The blocking oligomer designed for use with phi29 DNAP has the following features. The DNA substrate is a 23-nucleotide (nt) primer annealed to a synthetic 70-nt DNA template (Supplementary Fig. 1). To protect the DNA primer from phi29 DNAP-dependent extension and digestion in bulk phase, a blocking oligomer (dashed line) is annealed immediately adjacent to the DNA primer/template (p/t) junction. The blocking oligomer includes a ~25-nt complement to the template strand and two acridine (z) residues attached to the 5′ end. One of these acridines substitutes for a nucleotide and abuts the primer terminus; the other is an added 5′-overhang that is presumed to intercalate into the DNA duplex. A 3′ three-carbon spacer (s) followed by seven abasic (1′, 2′-H) residues (x's) was appended to the blocking oligomer. This appended segment has two functions: protection of the blocking oligomer against exonucleolysis by phi29 DNAP (Supplementary Fig. 2), and facilitation of blocking oligomer removal as the phi29 DNAP-DNA complex is pulled into the nanopore by an applied voltage.
The blocking oligomer protected the DNA primer strand from phi29 DNAP-dependent extension and digestion in bulk phase. In these experiments, DNA substrates and phi29 DNAP were incubated in nanopore buffer (0.3 M KCl, 10 mM Hepes/KOH pH 8, 1 mM EDTA, 1 mM DTT) supplemented with 10 mM Mg2+ for 5 hours at 23 °C. The products were then analyzed by denaturing polyacrylamide gel electrophoresis. Absent the blocking oligomer, phi29 DNAP digested the DNA primer strands (-dNTP, lane 3) or extended them (+dNTP, lane 4). In contrast, when protected by the blocking oligomer, the primer strands were neither digested (-dNTP, lane 6) nor extended (+dNTP, lane 7) by phi29 DNAP.
Our next objective was to remove the blocking oligomer from each individual DNA template captured in the nanopore so that phi29 DNAP could bind at the p/t junction. Initially we considered a proven strategy wherein active voltage control was used to unzip the blocking oligomer from the DNA template upon capture, followed by a voltage polarity reversal to drive the newly exposed DNA primer-template junction into the cis well to ‘fish’ for a polymerase molecule 11,15. This complex method proved to be unnecessary. The experiment described next used a 94mer DNA template strand bearing five abasic (1′, 2′-H) residues spanning positions +25 to +29 from the n=0 position. This abasic insert serves as an ionic current reporter during strand displacement through the α-HL pore 11-12,16-17. The DNA template was annealed to a 23-nt primer, and the 3′-terminus of the primer strand was protected from bulk phase modification by the blocking oligomer described above.
Figure 2 Forward and reverse ratcheting of DNA through the nanopore. (a) DNA substrate protected by a blocking oligomer. A blocking oligomer (red line) capped by two acridine residues at its 5′ end protects the primer from catalysis in bulk phase.
Addition of this DNA construct alone to the nanopore cis chamber resulted in ionic current blockades with a median duration of ~4 ms and an average residual current of 22.5 pA, similar to translocation events for DNA substrates bearing short duplex regions described previously 18. Events greater than 200 ms duration were rare. By comparison, the ionic current pattern described next was typical of 200 events when phi29 DNAP and dNTPs were subsequently added (see also Supplemental Videos 1 & 2). Capture of a DNA substrate molecule resulted in a 23-24 pA residual ionic current that lasted several seconds. Under a sustained 180 mV load, the ionic current then stepped through a series of discrete levels that traversed a 35 pA maximum before dropping to 22 pA and settling at a characteristic 25 pA amplitude. These current levels were caused by sequential movement of the five abasic residues of the template strand through the α-HL trans-membrane pore. This effect is especially pronounced as the abasic residues enter and then pass through the α-HL limiting aperture circumscribed by Lys147 12. Upon reaching the 25 pA amplitude, the ionic current steps reversed direction and retraced the 35 pA peak at about ten times the speed at which the first peak was traversed, before stalling at 24 pA. In this experiment, there were also 178 events where the ionic current series began in the same way but did not progress completely through the two peaks, due to enzyme dissociation (112/178), a stall within the first ionic current peak (23/178), or a stall in the second ionic current peak (44/178).
These nanopore data suggest that when phi29 DNAP was added to the nanopore bath in the presence of the protected DNA substrates, it formed stable but enzymatically inactive complexes due to the presence of the blocking oligomers. Activation of a given complex was achieved only upon nanopore capture. The successive stages of this hypothetical process are: i) the open channel; ii) nanopore capture of a polymerase-DNA complex with a blocking oligomer bound; iii) mechanical unzipping of the blocking oligomer promoted by the applied voltage, which ratchets the DNA template through the nanopore and gives rise to the first 35 pA current peak as the abasic insert traverses the major pore constriction; iv) release of the blocking oligomer, which exposes the 3′-OH terminus of the DNA primer within the polymerase active site; v) DNA replication by phi29 DNAP, which ratchets the template in the reverse direction through the nanopore, giving rise to the second 35 pA current peak; vi) stalling of DNA replication when the abasic residues of the template strand reach the catalytic site of phi29 DNAP.
This model makes three testable predictions. First, traversal of the first 35 pA peak due to voltage-driven unzipping of the blocking oligomer should be independent of phi29 DNAP catalytic capability. Therefore it should be observed in the absence of the Mg2+ ions required for both polymerase and exonuclease function. In experiments in which complexes of phi29 DNAP and the substrate described above were captured absent free Mg2+, the first 35 pA peak was indeed traversed, followed by a stall at the 25 pA level, and eventual voltage-promoted dissociation of the complex (Supplementary Fig. 3). The second 35 pA peak was not observed in the absence of Mg2+, supporting the second prediction of the model: because traversal of the second 35 pA ionic current peak requires DNA replication, it should be dependent upon the presence of both Mg2+ and a full complement of dNTP substrates. As an additional test, traversal of this second peak should stall if one of the required dNTP substrates is withheld. Results consistent with this prediction are described in Supplementary Figure 3.
The third prediction of this model is that progression into the proposed replication-dependent peak should be influenced by the chemical identity of the DNA primer terminus. In particular, substitution of the 3′-OH terminus with a 3′-H terminus should delay appearance of the second 35 pA current peak by causing a stall as the p/t junction is positioned in the polymerase active site (stage iv of the model). This prediction also proved to be correct. That is, using a substrate bearing a 3′-H terminated primer, the first 35 pA peak was traversed as it was with the substrate bearing a 3′-OH terminus, due to voltage-driven unzipping of the blocking oligomer. The ionic current then stalled for several seconds at 25 pA (red horizontal arrow). Eventually, traversal of the second 35 pA peak was observed. This recovery was due to excision of the 3′-H terminated residue by the phi29 DNAP exonuclease and subsequent strand elongation beginning at the newly exposed 3′-OH of the neighboring dGMP nucleotide 12.
Together these experiments indicate that phi29 DNAP can be used to control forward and reverse ratcheting of individual DNA templates through the α-HL pore. In the forward direction, the template strand is driven 5′-to-3′ through the nanopore by applied voltage as its complementary blocking oligomer is unzipped at the bound phi29 DNAP enzyme. In the reverse direction, replication by phi29 DNAP extends the primer strand and thus biases movement of the template in the 3′-to-5′ direction relative to the pore. To quantify the average rate of DNA template movement per nucleotide in each mode, it was necessary to account for all 25 template nucleotides in each direction along the trace. We first established an ionic current map by building a composite derived from 10 traces that traversed both amplitude peaks. Thirty-two reproducible amplitudes were resolved for more than one-half of the replicated DNA templates using a 3 ms minimum cutoff. These amplitude steps were symmetric around a 25 pA midpoint (ionic current state 0) except for state -1, which was not observed in a majority of traces. To confirm that the 16 ionic current states in the replication-dependent peak correspond to displacement of 25 template nucleotides, we measured translocation pauses when one dNTP substrate at a time was reduced to 100 nM in the nanopore buffer while all other dNTPs were held at 100 μM (Supplementary Fig. 4). As anticipated, when these concentration-dependent pauses were assembled in logical order, 25 nucleotide additions to the DNA primer strand could be accounted for within ionic current states 0 to 16 of the map. Along this 25 nucleotide DNA segment, the median rate of replication was 40 nt s⁻¹ (IQR = 12 nt s⁻¹, n = 200). The ionic current states during voltage-driven unzipping (states 0 to -16) mirrored the states observed during replication. If we assume 25-nucleotide displacement during that process as well, the rate of translocation during unzipping was measurably slower (median = 2.5 nt s⁻¹, IQR = 3.2 nt s⁻¹).
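The two median rates can be put on the same footing as the temporal-control requirement (0.1-1000 ms per nucleotide) by converting them to nominal dwell times per nucleotide. The short Python sketch below does this arithmetic; the function names are illustrative and not part of the original analysis, and the reciprocal of a median rate is only a nominal dwell time, not a measured one. The comparison with free electrophoresis is also approximate, because the 3 μs per nucleotide figure was reported at 120 mV whereas these experiments used a 180 mV load.

```python
# Nominal per-nucleotide dwell times implied by the median rates in the text.
# All function names here are hypothetical helpers for illustration only.

REPLICATION_RATE_NT_PER_S = 40.0   # median replication rate reported in the text
UNZIPPING_RATE_NT_PER_S = 2.5      # median unzipping rate reported in the text
FREE_ELECTROPHORESIS_US_PER_NT = 3.0  # approximate, at 120 mV for alpha-HL


def ms_per_nt(rate_nt_per_s: float) -> float:
    """Nominal dwell time per nucleotide (ms) for a rate given in nt/s."""
    return 1000.0 / rate_nt_per_s


def within_window(dwell_ms: float, low_ms: float = 0.1, high_ms: float = 1000.0) -> bool:
    """Check a dwell time against the 0.1-1000 ms per nucleotide requirement."""
    return low_ms <= dwell_ms <= high_ms


def slowdown_factor(dwell_ms: float, free_us_per_nt: float = FREE_ELECTROPHORESIS_US_PER_NT) -> float:
    """Approximate fold reduction in speed relative to free electrophoresis."""
    return (dwell_ms * 1000.0) / free_us_per_nt


replication_ms = ms_per_nt(REPLICATION_RATE_NT_PER_S)
unzipping_ms = ms_per_nt(UNZIPPING_RATE_NT_PER_S)

for label, dwell in (("replication", replication_ms), ("unzipping", unzipping_ms)):
    print(f"{label}: {dwell:.0f} ms/nt, in window: {within_window(dwell)}, "
          f"~{slowdown_factor(dwell):.0f}-fold slower than free electrophoresis")
```

Under these assumptions, both modes fall inside the required window: roughly 25 ms per nucleotide for replication and 400 ms per nucleotide for unzipping.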
Figure 3 Reproducible ionic current states as DNA is ratcheted through the nanopore. The DNA template and experimental conditions are identical to those in . (a) Representative ionic current trace as a single DNA molecule is ratcheted through the pore.
To increase throughput we reduced the length of the blocking oligomer segment annealed to the target DNA template from 25 to 15 nucleotides (in the context of the blocking oligomer design described above). When annealed to a p/t substrate (Supplementary Fig. 5a), this blocking oligomer still afforded protection of DNA in bulk phase (Supplementary Fig. 5b) but allowed faster removal of the blocking oligomer at the nanopore. With this shorter blocking oligomer, up to 500 molecules were processed in single file order at a rate of 130 per hour through one pore.
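As a rough consistency check on these throughput figures, the sketch below computes how long a run of 500 molecules would take at 130 molecules per hour and compares that with the five-hour period over which the bulk-phase complexes were shown to remain quiescent. It assumes a constant capture rate over the whole run, which the text does not state; the function name is illustrative.

```python
# Back-of-envelope run time for the reported throughput.
# Assumes (hypothetically) that the capture rate stays constant during the run.

MOLECULES_PROCESSED = 500      # upper figure reported in the text
RATE_PER_HOUR = 130.0          # molecules per hour through one pore
QUIESCENT_WINDOW_HOURS = 5.0   # minimum bulk-phase stability reported in the text


def hours_to_process(n_molecules: int, rate_per_hour: float) -> float:
    """Run time in hours for n_molecules at a constant rate."""
    return n_molecules / rate_per_hour


run_hours = hours_to_process(MOLECULES_PROCESSED, RATE_PER_HOUR)
seconds_per_molecule = 3600.0 / RATE_PER_HOUR

print(f"run time: {run_hours:.2f} h, about {seconds_per_molecule:.0f} s per molecule")
print(f"fits within quiescent window: {run_hours <= QUIESCENT_WINDOW_HOURS}")
```

On this estimate the run lasts a little under four hours, which is within the period for which the blocked complexes were shown to be stable.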
We next determined the probability of template registry errors that occur when using this phi29 DNAP control strategy. There are two types: 1) errors that occur when the rate of strand displacement past the pore sensor exceeds the rate of data acquisition, so that the nucleotide at a given position would be missed or ‘deleted’; and 2) errors that occur when the strand slips back and forth so that a given position is read more than once, so that a nucleotide would be falsely ‘inserted’ into the sequence order. Together these would result in registry-dependent “indel” errors, that is, insertion or deletion of bases relative to the correct sequential series.
Figure 4 Estimating DNA template registry errors in the nanopore during phi29 DNAP controlled translocation. The DNA template and experimental conditions are identical to those described in . (a) Example ionic current traces.
To accurately measure the frequency of these registry errors, it was essential to focus on ionic current states that correspond to single nucleotide positions bounded by clearly discernible neighboring states. The data suggested that state 8 (bounded by states 7 and 9) and state 11 (bounded by states 10 and 12) within the replication-dependent peak satisfy these criteria. This was confirmed using a DNA mapping strategy described previously 12-14 (Supplementary Fig. 6). By extension, the mirror image ionic current states -8 and -11 of the voltage-driven peak were also used.
The model we used to quantify registry errors is described here for the replication-driven ratchet. A correct read was called when the current advanced from ionic current state 7 to state 8 and resided there for at least 3 ms before advancing to ionic current state 9 (green arrows). A deletion error was called when the current advanced from ionic current state 7 directly to state 9 without residing in state 8 for at least 3 ms (red dashed arrow). An insertion error was called when the current advanced from ionic current state 7 to state 8 and resided there for at least 3 ms, but then returned to state 7 for at least 3 ms before advancing once again through state 8 and then to state 9 (grey dashed arrow). Corresponding arrows centered at states -8 and -11 (voltage-driven zipper) and state 11 (replication-driven ratchet) have the same meaning.
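The calling rules above can be written as a small classifier. The Python sketch below is one possible reading of those rules, not the authors' analysis code: it assumes each trace has already been idealized into a time-ordered list of (state, dwell in ms) segments, it treats any visit shorter than the 3 ms cutoff as unresolved, and it adds a hypothetical "incomplete" label for traces that never reach the bounding states. All names are illustrative.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Segment = Tuple[int, float]  # (ionic current state, dwell time in ms)


def classify_passage(segments: List[Segment], prev_state: int, target: int,
                     next_state: int, min_dwell_ms: float = 3.0) -> str:
    """Call one passage through `target` as correct, deletion, or insertion.

    Visits shorter than `min_dwell_ms` are treated as unresolved and ignored.
    "incomplete" is a label added for this sketch, not a category in the text.
    """
    resolved = [state for state, dwell in segments if dwell >= min_dwell_ms]
    try:
        start = resolved.index(prev_state)
        end = resolved.index(next_state, start + 1)
    except ValueError:
        return "incomplete"  # never reached prev_state, or never advanced past it

    window = resolved[start:end]  # resolved visits between prev_state and next_state
    if target not in window:
        return "deletion"         # no resolved residence in the target state
    first_target = window.index(target)
    if prev_state in window[first_target + 1:]:
        return "insertion"        # resolved return to prev_state after the target
    return "correct"


def error_frequencies(calls: Iterable[str]) -> Dict[str, float]:
    """Fraction of each call among complete passages."""
    counts = Counter(c for c in calls if c != "incomplete")
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: counts[label] / total
            for label in ("correct", "deletion", "insertion")}


# Synthetic example traces (made-up dwell times, for illustration only).
traces = [
    [(7, 10.0), (8, 20.0), (9, 12.0)],                        # resolved 7 -> 8 -> 9
    [(7, 10.0), (8, 1.0), (9, 12.0)],                         # state 8 under 3 ms
    [(7, 10.0), (8, 5.0), (7, 4.0), (8, 2.0), (9, 12.0)],     # slips back to 7
]
calls = [classify_passage(t, prev_state=7, target=8, next_state=9) for t in traces]
print(calls, error_frequencies(calls))
```

For the voltage-driven zipper, the same function can be called with the mirror-image states (for example `prev_state=-7, target=-8, next_state=-9`), assuming the states are traversed in that order.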
Error estimates were obtained for individual molecules read once in each direction. We found that the insertion error probability at one nucleotide spatial precision ranged from 5% to 10.5%, and the deletion error probability ranged from 5% to 15%. The combined error probability (either an insertion or deletion at a given position in a single pass) ranged from 10% to 24.5%. Errors at positions 11/-11 were about half as frequent as errors at positions 8/-8.
We conclude that DNA substrates pre-bound to phi29 DNAP can be protected from enzymatic modification for many hours using blocking oligomers, and that activation of each complex for replication occurs upon capture on the α-HL pore. Because the DNA molecules are enzymatically modified only at the nanopore, it is possible to combine all components of the replication reaction (i.e. DNA template, dNTP, Mg2+, polymerase) in the nanopore chamber at time zero and run a lengthy analysis of many DNA templates without further user intervention. Thus, DNA strand sequence could be determined for an unknown template in a homogeneous preparation by analyzing a series of single molecules. Alternatively, DNA strand sequencing could be performed on molecules captured one-by-one from a heterogeneous mixture (e.g. a genomic digest) with multiple reads per individual strand to achieve consensus.
Finally, processive circa 5 Å DNA template displacement was documented at two positions each for the voltage-driven zipper and for the replication-driven ratchet. This allowed us to make initial probability estimates for registry errors due to uncontrolled strand motion at the single nucleotide scale. These DNA template registry errors (10% to 24.5% combined probability for insertions and deletions at a given position) must be reduced for a commercial nanopore sequencing device. This will be a fruitful area of research because several factors that influence template motion and the ionic current signal-to-noise ratio have not been optimized. These include ionic strength, polymerase mechanics, enzyme identity, applied voltage, and composition of the trans-side buffer. Even absent improvements, proof of concept DNA sequencing is conceivable using this robust control strategy when coupled to nanopores capable of nucleotide discrimination 18-20.