We devised a system that exploits the continuous culture and selection of the M13 filamentous bacteriophage¹¹ (commonly used in phage display¹²) to enable the continuous directed evolution of proteins or nucleic acids. In phage-assisted continuous evolution (PACE), E. coli host cells continuously flow through a fixed-volume vessel (the “lagoon”) containing a replicating population of phage DNA vectors (“selection phage”, SP) encoding the gene(s) of interest (Supplementary Fig. 1).
The average residence time of host cells in the lagoon is less than the time required for E. coli replication. As a result, mutations accumulate only in the evolving SP population, the only DNA that can replicate faster than the rate of lagoon dilution. The mutation of host cells in the lagoon should therefore have minimal impact on the outcome of the selection over many rounds of phage replication, and mutagenesis conditions are not limited to those that preserve E. coli viability.
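The dilution argument can be illustrated with a toy calculation. The sketch below assumes a well-mixed lagoon at constant volume and unconstrained exponential growth (it ignores the fact that phage replication is limited by the supply of host cells). The function names, the doubling times, and the flow rate in the example are illustrative assumptions, not measured values.

```python
import math

def mean_residence_time_h(dilution_per_h: float) -> float:
    """Average time a cell spends in a well-mixed vessel at the given dilution rate."""
    return 1.0 / dilution_per_h

def net_log10_change(doubling_time_h: float, dilution_per_h: float, hours: float) -> float:
    """log10 fold change of a replicator that grows exponentially while being diluted."""
    growth_rate = math.log(2) / doubling_time_h  # per hour
    return (growth_rate - dilution_per_h) * hours / math.log(10)

# ASSUMED example values: 2.0 lagoon volumes per hour for 24 hours,
# a host lineage doubling every 30 min, and a replicator doubling every 15 min.
host = net_log10_change(doubling_time_h=0.5, dilution_per_h=2.0, hours=24)
fast = net_log10_change(doubling_time_h=0.25, dilution_per_h=2.0, hours=24)

print(f"mean residence time: {mean_residence_time_h(2.0):.2f} h")
print(f"host lineage log10 change:    {host:+.1f}")
print(f"fast replicator log10 change: {fast:+.1f}")
```

Under these assumptions any host lineage, mutated or not, is diluted out of the lagoon, while a replicator that outpaces the dilution rate persists. This is the sense in which mutations can accumulate only in the SP population.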
PACE achieves continuous selection by linking the desired activity to the production of infectious progeny phage containing the evolving gene(s). Phage infection requires protein III (pIII; encoded by gene III), which mediates F pilus binding and host cell entry.¹³ Phage lacking pIII are ~10⁸-fold less infectious than wild-type phage.¹⁴ Crucially, the production of infectious phage scales with increasing levels of pIII over concentrations spanning two orders of magnitude.¹⁵ To couple pIII production to the activity of interest, we deleted gene III from the phage vector and inserted it into an “accessory plasmid” (AP) present in the E. coli host cells (see Supplementary Fig. 2 for plasmid maps). The production of pIII from the AP is dependent on the activity of the evolving gene(s) on the SP. Only phage vectors able to induce sufficient pIII production from the AP will propagate and persist in the lagoon (Fig. 1). Because pIII expression level determines the rate of infectious phage production,¹⁵ phage encoding genes that result in a higher level of pIII production will infect more host cells than phage encoding less active genes.
Figure 1 Overview of the PACE system. PACE in a single lagoon. Host cells continuously flow through a lagoon, where they are infected with selection phage (SP) encoding library members. Functional library members induce production of pIII from the accessory plasmid ...
Due to the speed of the phage life cycle (progeny phage production begins ~10 minutes post-infection),¹⁶ PACE can mediate many generations of selective phage replication in a single day. We observed activity-dependent phage vectors that tolerate lagoon flow rates up to 3.2 volumes per hour (Supplementary Fig. 3), corresponding to ~115 population doublings and an average of ~38 phage generations per 24 hours (see the Supplementary Information for an analysis). More conservative flow rates of 2.0–2.5 volumes per hour allow 24–30 generations per day and reduce the risk of complete phage loss (washout) during selections. Multiple lagoons can evolve genes in parallel, with each 100 mL lagoon containing ~5×10¹⁰ host cells selectively replicating active phage variants. Importantly, PACE requires no intervention during evolution and obviates the need to create DNA libraries, transform cells, extract genes, or perform DNA cloning steps during each round.
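The link between flow rate and the amount of replication needed to avoid washout can be sketched with simple arithmetic. The example below assumes an ideal well-mixed lagoon, in which passing V lagoon volumes dilutes a non-replicating population e^V-fold. The helper names are hypothetical. The ratio of one phage generation per two lagoon volumes is inferred here only from the generation counts quoted in the text; it is not the analysis given in the Supplementary Information.

```python
import math

def lagoon_volumes(flow_vol_per_h: float, hours: float = 24.0) -> float:
    """Total lagoon volumes exchanged over the given time."""
    return flow_vol_per_h * hours

def doublings_to_offset(volumes: float) -> float:
    """Population doublings needed to offset an e**volumes dilution."""
    return volumes / math.log(2)

def approx_generations(volumes: float, volumes_per_generation: float = 2.0) -> float:
    # ASSUMPTION: ratio inferred from the figures quoted in the text
    # (24-30 generations per day at 2.0-2.5 volumes per hour).
    return volumes / volumes_per_generation

for rate in (2.0, 2.5):
    v = lagoon_volumes(rate)
    print(f"{rate} vol/h: {v:.0f} volumes/day, "
          f"{doublings_to_offset(v):.0f} doublings, "
          f"~{approx_generations(v):.0f} generations")
```

At the conservative flow rates, this reproduces the 24 and 30 generations per day stated above.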
In principle, PACE is capable of evolving any gene that can be linked to pIII production in E. coli. Because a wide variety of functions including DNA binding, RNA binding, protein binding, bond-forming catalysis, and a variety of enzyme activities have been linked to the expression of a reporter protein,¹⁷,¹⁸ PACE can be applied to the evolution of many different activities of interest. As examples, we successfully linked protein-protein binding, recombinase activity, and RNA polymerase activity to phage infectivity in discrete infection assays by creating variants of the AP that associate each of these activities with pIII production (Fig. 2).
Figure 2 Linkage of three protein activities to pIII production and phage infectivity using three distinct APs. E. coli cells containing APs encoding conditionally expressed gene III (left) and selection phage were combined with recipient cells. Phage production ...
PACE applies optimal evolutionary pressure when pIII levels are above the minimal threshold required to prevent phage washout, but below the amount needed to maximize infectious phage production. This window can be shifted by varying the copy number of the AP, or by altering the ribosome-binding site (RBS) sequence of gene III to modulate the efficiency with which gene III is transcribed or translated (Supplementary Fig. 4).
We constructed an arabinose-inducible mutagenesis plasmid (MP) that elevates the error rate during DNA replication in the lagoon by suppressing proofreading¹⁹ and enhancing error-prone lesion bypass (Supplementary Information). Full induction increased the observed mutagenesis rate by ~100-fold, inducing all possible transitions and transversions (Supplementary Fig. 5). This enhanced mutation rate is sufficient to sample all possible single and double mutants of a given sequence each generation (Supplementary Information), in principle enabling single-mutation fitness valleys to be traversed during PACE.
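To give a sense of scale for "all possible single and double mutants", the sketch below counts the distinct point-mutant sequences of a given order. The 1,000-nucleotide gene length is an arbitrary illustrative assumption, and the function name is hypothetical. Counting variants says nothing by itself about how often each one arises; that depends on the mutation rate analysis in the Supplementary Information.

```python
from math import comb

def count_point_mutants(length_nt: int, order: int) -> int:
    """Number of sequences differing from the parent at exactly `order` positions.

    Each mutated position can take any of the 3 other nucleotides.
    """
    return comb(length_nt, order) * 3 ** order

GENE_LENGTH_NT = 1_000  # ASSUMED example length, not taken from the text

singles = count_point_mutants(GENE_LENGTH_NT, 1)
doubles = count_point_mutants(GENE_LENGTH_NT, 2)
print(f"single mutants: {singles:,}")   # 3,000
print(f"double mutants: {doubles:,}")   # 4,495,500
```

For a gene of this assumed size, the number of distinct double mutants (about 4.5 million) is several orders of magnitude below the ~5×10¹⁰ host cells stated for a 100 mL lagoon.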
Bacteriophage T7 RNA polymerase (T7 RNAP) is widely used to transcribe RNA in vitro and in cells. T7 RNAP is highly specific for its promoter sequence (TAATACGACTCACTATA), and exhibits virtually no detectable activity on the consensus promoter of the related bacteriophage T3 (A
A, differences underlined).²¹,²² Despite decades of study and several attempts to engineer the specificity of T7 RNAP towards other promoters,²²,²³ including that of T3, a mutant T7 RNAP capable of recognizing the T3 promoter has not been previously reported.
To remove potential interference from evolutionary improvements to the phage vector rather than to T7 RNAP, we propagated an SP expressing wild-type T7 RNAP for three days on host cells containing an AP with the wild-type T7 promoter driving gIII expression. A single plaque presumed to represent vector-optimized SP contained a single mutation (P314T) in T7 RNAP. We confirmed that the activity of the P314T mutant does not significantly differ from that of wild-type T7 RNAP (Supplementary Fig. 6).
This starting SP failed to propagate on host cells containing the T3 promoter AP. We therefore propagated the phage on cells containing a hybrid T7/T3 promoter AP with the T7 promoter base at the important -11 position²¹ but all other positions changed to their T3 counterparts. Two initially identical lagoons were evolved in parallel on the hybrid promoter AP for 60 hours, then on the complete T3 promoter AP for 48 hours, and finally on a high-stringency, very low-copy T3 promoter AP for 84 hours (Fig. 3a).
Figure 3 Continuous evolution of T7 RNAP variants that recognize the T3 promoter. (a) PACE schedule. (b) Activity in cells of T7 RNAP variants isolated from lagoon 1 at 48, 108, and 192 hours on the T7 and T3 promoters. Transcriptional activity was measured spectrophotometrically ...
In both lagoons, phage persisted after 8 days of PACE, surviving a net dilution of 10¹⁶⁷-fold, the equivalent of 555 phage population doublings and ~200 rounds of evolution by the average phage. We isolated, sequenced, and characterized phage vectors from each lagoon after 48, 108, and 192 hours, observing up to 8, 10, and 11 non-silent mutations in single T7 RNAP genes at the respective time points.
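The dilution and doubling figures quoted here can be cross-checked against the three-stage schedule described earlier (60, 48, and 84 hours). The sketch below assumes an ideal well-mixed lagoon, so that a net dilution of 10^167-fold corresponds to 167·ln(10) lagoon volumes. The variable and function names are hypothetical, and the implied flow rate is a derived average, not a value reported in the text.

```python
import math

# Stage durations in hours, as stated in the PACE schedule.
STAGES_H = {
    "hybrid T7/T3 promoter AP": 60,
    "T3 promoter AP": 48,
    "very low-copy T3 promoter AP": 84,
}

def doublings_from_log10_dilution(log10_dilution: float) -> float:
    """Population doublings equivalent to a 10**log10_dilution-fold dilution."""
    return log10_dilution * math.log2(10)

def volumes_from_log10_dilution(log10_dilution: float) -> float:
    """Lagoon volumes exchanged, assuming ideal continuous dilution (e**V-fold)."""
    return log10_dilution * math.log(10)

total_hours = sum(STAGES_H.values())
doublings = doublings_from_log10_dilution(167)
mean_flow = volumes_from_log10_dilution(167) / total_hours

print(f"total time: {total_hours} h ({total_hours / 24:.0f} days)")
print(f"equivalent doublings: {doublings:.0f}")
print(f"implied mean flow rate: {mean_flow:.2f} volumes per hour")
```

The schedule sums to 192 hours (8 days), the dilution corresponds to roughly 555 doublings, and the implied average flow rate is close to 2 lagoon volumes per hour.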
Protein-encoding regions (without upstream promoter sequences) of evolved mutant T7 RNAP genes were subcloned into assay plasmids that quantitatively link transcriptional activity to beta-galactosidase expression in cells.²⁴ We defined the activity of wild-type T7 RNAP on the T7 promoter to be 100%. The starting T7 RNAP exhibited undetectable (< 3%) levels of activity on the T3 promoter in these cell-based assays. The assayed mutants exhibited > 200% activity after 108 hours of PACE, and > 600% activity following high-stringency PACE at 192 hours, improvements of more than 200-fold (Fig. 3b). These results collectively establish the ability of PACE to very rapidly evolve large changes in enzyme activity and specificity with minimal intervention by the researcher.
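The "more than 200-fold" figure follows from the detection limit of the assay. Because the starting activity is only bounded above (< 3%) and the evolved activity is only bounded below (> 600%), their ratio is a lower bound on the improvement. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def min_fold_improvement(evolved_lower_bound_pct: float,
                         starting_upper_bound_pct: float) -> float:
    """Lower bound on fold improvement when the start is below the detection limit."""
    return evolved_lower_bound_pct / starting_upper_bound_pct

# Activities are percentages of wild-type T7 RNAP on the T7 promoter.
print(min_fold_improvement(600, 3))  # 200.0 (after 192 hours)
print(round(min_fold_improvement(200, 3), 1))  # 66.7 (after 108 hours)
```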
Several evolved T7 RNAP mutants were also purified and assayed in vitro using radioactive nucleotide incorporation assays. Purified T7 RNAP mutants exhibited activity levels on the T3 promoter in vitro exceeding that of wild-type T7 RNAP on the T7 promoter, representing improvements of up to 89-fold compared with the starting enzyme (Supplementary Fig. 7). These results indicate that PACE resulted in large improvements in substrate binding or catalytic rate. Evolved activity improvements were higher in cells than in vitro, suggesting that these enzymes also evolved improvements in features such as expression level, polymerase folding, or stability that are specific to the context of the cytoplasm.
Interestingly, the evolutionary dynamics of the two initially identical lagoons differed significantly (Supplementary Results). Within 24 hours, lagoon 1 acquired a predominant suite of mutations consisting of I4M, G175R, E222K, and G542V and changed little thereafter beyond acquiring N748D, a mutation known to enable recognition of the T3 base at the -11 position,²¹ following exposure to the full T3 promoter. In contrast, lagoon 2 accessed these mutations more slowly before a different suite of mutations also including N748D became predominant at 108 hours, only to be displaced by the same suite of mutations observed in lagoon 1. The presence of several mutations unique to lagoon 2 throughout the experiment suggests that lagoon cross-contamination did not occur. The distinct evolutionary trajectories of the two lagoons prior to their ultimate convergence upon a common set of mutations highlight the ability of PACE to rapidly discover multiple viable pathways to a target activity in parallel experiments. This capability may enable a more in-depth experimental study of protein evolutionary dynamics than can be achieved with conventional directed evolution methods that cannot complete so many rounds of evolution on a practical time scale.
T7 RNAP is highly specific for initiation with GTP,²⁵,²⁶ significantly limiting its usefulness for the in vitro transcription of RNAs that begin with other nucleotides. As initiation has been described as a mechanistically challenging step in transcription,²⁷ we next used PACE to evolve T7 RNAP variants capable of initiating transcription with other nucleotides in a template-directed manner. T7 RNAP is known to preferentially initiate with GTP up to several bases downstream of the +1 position if the template is devoid of early guanines in the coding strand.²⁶ We therefore constructed accessory plasmids in which positions +1 through +6 of the gene III transcript were AAAAAA (iA6) or CCCCCC (iC6).
We used PACE to rapidly evolve variants of T7 RNAP capable of initiating with ATP. In light of previous reports indicating varying degrees of initiation of T7 RNAP with ATP,²⁵,²⁶ we propagated starting phage in host cells with a high-copy iA6 AP for 24 hours, followed by a 30:70 high-copy:very low-copy mixture of host cells for 12 hours (Fig. 4a). At a dilution rate of 2.5 volumes per hour, phage survived a total dilution of 10³⁹-fold and experienced an average of ~45 rounds of evolution.
Figure 4 Continuous evolution of T7 RNAP variants that initiate transcription with A. (a) PACE schedule. (b) Activity in cells of T7 RNAP variants on the T7 and iA6 promoters isolated after 36 hours of PACE. Assays were performed as described in Fig. 3. Error ...
The wild-type enzyme exhibited undetectable initial activity in cells (< 3%) on the iA6 promoter. All six clones isolated after only 36 hours of PACE exhibited at least 170% activity on the iA6 promoter in cell-based assays, while retaining at least 120% activity on the wild-type promoter (Fig. 4b). Purified variants assayed in vitro exhibited activities on the iA6 promoter matching that of wild-type T7 RNAP on the T7 promoter (Supplementary Fig. 8). RACE analysis of transcripts produced by the most active clone (A6-36.4) confirmed that this enzyme begins transcripts with the template-directed bases on the iA6, and wild-type promoters (Supplementary Fig. 9a). All six characterized clones contained K93T, S397R, and S684Y mutations, while three of the six also contained S228A (Supplementary Table 2). Residue 397 directly contacts the nascent RNA strand,²⁸
suggesting a role for S397R in allowing efficient initiation of iA6
We concurrently evolved T7 RNAP to initiate transcripts with CTP (Supplementary Fig. 10a). We observed that wild-type T7 RNAP retains significant activity on the iC6 promoter (~50%) both in cells and in vitro (Supplementary Fig. 10b and 10c), a surprising observation in light of reports that the enzyme initiates with G at the +2 position if the +1 position is C.²⁵ While the high starting activity precluded large improvements, the most active PACE-evolved variants nevertheless exceeded 100% activity on the iC6 promoter both in cells and in vitro (Supplementary Fig. 10b and 10c and Supplementary Results). RACE analysis of transcripts produced by the most active clone (C6-80.9) confirmed that this enzyme begins transcripts with the template-directed bases (Supplementary Fig. 9b).
The three PACE experiments executed 45 to 200 rounds of evolution in 1.5 to 8 days and yielded T7 RNAP variants with activities on their target promoters or templates that exceed or match the activity of the wild-type enzyme transcribing the wild-type T7 promoter both in cells and in vitro. This degree of improvement is especially significant given that for two of the evolved activities, the starting polymerase exhibited virtually no detectable activity.
The evolved A6-36.4 variant of T7 RNAP can initiate transcription from iC6, and wild-type templates in a template-directed manner with efficiencies comparable to that of wild-type T7 RNAP initiating with the wild-type template (Supplementary Fig. 11) and sequence fidelity sufficient to mediate the production of functional pIII and LacZ enzyme. These findings suggest that this enzyme, and possibly other PACE-evolved variants, may represent improved, more general T7 RNA polymerases for routine in vitro
and in vivo
The PACE system can be assembled entirely from a modest collection of commercially available equipment (listed in Supplementary Table 3) and does not require the manufacture of any specialized components. The ability to perform dozens of rounds of evolution each day with minimal researcher involvement implies that PACE is particularly well suited to address problems or questions in molecular evolution that require hundreds to thousands of generations, or the execution of many evolution experiments in parallel. More generally, PACE represents the integration and manipulation of many protein and nucleic acid components in a living system to enable the rapid generation of biomolecules with new activities, a significant example and goal of synthetic biology.