De novo designed sequences rescue single-gene knockout strains of E. coli
Previous work in our laboratory showed that several purified proteins from our binary patterned alpha-helical libraries can bind a cofactor and perform enzyme-like functions, including peroxidase, lipase, and esterase activities. These findings led us to question whether proteins from this library might also provide biological functions that enable cell growth. To address this question, we tested the ability of binary patterned proteins to rescue strains of E. coli in which a conditionally essential function had been deleted. These strains were obtained from the Keio collection, which contains all the viable single-gene knockout strains of E. coli. We tested 27 auxotrophs that grow on rich media, but fail to grow on minimal media (see legend for ). Each auxotroph was transformed with a library of synthetic genes carried on the expression vector pCA24NMAF2. Typically, we obtained 5–10 million transformants, thereby ensuring reasonably deep coverage of the library of 1.5×10⁶ de novo sequences. As negative controls, each auxotroph was transformed with the empty vector or the same vector expressing beta-galactosidase. Transformed cells were spread either on LB (rich) or on M9-glucose (minimal) media supplemented with isopropyl-beta-D-thiogalactoside (IPTG) to induce expression, and the formation of colonies was monitored ( & ).
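The coverage claim can be checked with a rough estimate. The sketch below assumes, as a simplification not stated in the text, that every library member is equally represented and that each transformant takes up one member at random; under that assumption the expected fraction of the library sampled at least once is 1 − exp(−N/L). The function name is illustrative.

```python
import math

def expected_coverage(library_size: float, transformants: float) -> float:
    """Expected fraction of distinct library members sampled at least once.

    Assumes uniform representation and independent uptake (a simplification).
    """
    return 1.0 - math.exp(-transformants / library_size)

LIBRARY_SIZE = 1.5e6  # library size given in the text

for n_transformants in (5e6, 10e6):  # range of transformants given in the text
    coverage = expected_coverage(LIBRARY_SIZE, n_transformants)
    print(f"{n_transformants:.0e} transformants: {coverage:.1%} of library expected")
```

Under these assumptions, 5 million transformants would be expected to sample roughly 96% of the library and 10 million more than 99%. Real libraries are rarely uniform, so the true coverage is likely somewhat lower.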
Complementation frequency and time required for colony formation on selective media (M9-glucose).
Rescue of E. coli auxotrophs by de novo proteins.
On rich media, all transformed cells grew, regardless of whether they received the control lacZ gene or a gene encoding a novel protein. As expected, on minimal media, auxotrophs transformed with the control plasmids failed to grow. (If the negative controls grew, the strain was considered ‘leaky’ and not used for further studies.) In most cases, auxotrophs transformed with the collection of novel sequences also failed to produce colonies, even after weeks of incubation. This indicates that proteins from the collection could not rescue these strains. However, for four auxotrophs – ΔserB, ΔgltA, ΔilvA, and Δfes – colonies formed on minimal plates after several days of incubation (). Growth of these cells under selective conditions suggests that a novel gene carried by the plasmid complements the deletion. The rates of complementation and the times required for colony formation are summarized in .
To confirm that the colonies observed on selective media resulted from the uptake of a novel gene and not from an adaptive mutation on the chromosome, colonies were isolated by restreaking, and plasmid DNA was purified; this DNA was then used to transform naïve cells of the same auxotroph, and the transformed cells were once again spread on both LB and minimal plates. Growth of approximately the same number of colonies on selective plates as on rich plates indicates that the observed phenotype (growth on minimal) transferred with the genotype (the plasmid carrying the novel gene). These results are shown in and Figure S1. To ensure that rescue was due to a de novo protein, and not some natural sequence that might have been picked up inadvertently during plasmid constructions, we isolated the ~300 base pair fragment encoding the novel protein, recloned it into a new vector, and showed that the new clone also rescued the deletion strain.
The binary patterned sequences that rescue the four auxotrophs are shown in . (Additional sequences are shown in Figure S2.) The novel sequences are designated according to the auxotroph they rescue, followed by a number (e.g. Syn-Fes-1 is a synthetic sequence that rescues Δfes.) For the ΔgltA strain, we isolated one novel protein that enabled cell growth. However, in the other cases, several sequences were isolated, suggesting that deletions of these genes are relatively easy to rescue. As shown by the color-coding in , the sequences of the biologically active proteins conform to our binary pattern for the design of 4-helix bundles. (As is often true for sequences in combinatorial libraries, spurious mutations were observed occasionally, but these do not interfere with the overall binary pattern – see Figure S2). NCBI-BLAST searches indicate the sequences of the de novo proteins are unlike those of any known natural proteins.
Designed amino acid sequences that enable growth of E. coli auxotrophs on selective media.
The ability of each de novo protein to sustain cell growth was measured in cultures grown in minimal M9-glucose liquid media. Growth rates were compared for deletion strains expressing a de novo protein versus the same strains expressing either the negative control (LacZ) or the corresponding natural protein on the same pCA24NMAF2 vector. As shown in , the novel proteins enable cell growth under conditions where auxotrophs transformed with the control plasmid fail to grow at all. Cells relying on the de novo proteins grow significantly more slowly than those expressing the natural protein. (Fes is an exception because overexpression of E. coli Fes is toxic [14, Genobase ORF JW0576 - http://ecoli.naist.jp/GB6/info.jsp?id=JW0576].) It is not surprising that the unevolved novel sequences function at substantially lower levels than natural sequences selected by billions of years of evolution. Indeed, the relatively slow rates of cell growth enabled by these first-generation de novo sequences suggest that selections for faster growth might facilitate the evolution of more active proteins. Such experiments will provide an opportunity to test whether de novo progenitor sequences might lead to novel evolutionary trajectories.
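Growth rates of this kind are commonly compared by fitting the exponential phase of an optical density curve. The sketch below is one such approach and not necessarily the procedure used here: it fits ln(OD) against time by least squares and converts the slope to a doubling time. The function names and the data points are synthetic, chosen only to illustrate the calculation; they are not measurements from this study.

```python
import math

def growth_rate(times_h, od_values):
    """Least-squares slope of ln(OD) versus time, in units of 1/hour."""
    logs = [math.log(od) for od in od_values]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    return sxy / sxx

def doubling_time(rate_per_h):
    """Doubling time in hours for an exponential growth rate."""
    return math.log(2) / rate_per_h

# Synthetic example: a culture whose OD doubles every 2 hours.
times = [0.0, 2.0, 4.0, 6.0]
ods = [0.01, 0.02, 0.04, 0.08]
rate = growth_rate(times, ods)
print(f"rate = {rate:.3f} per hour, doubling time = {doubling_time(rate):.2f} h")
```

Only points from the exponential phase should be included in such a fit; lag and stationary phases would bias the slope downward.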
Growth of auxotrophic strains of E. coli in selective liquid media.
The results shown in and demonstrate that the de novo sequences enable cell growth; however, viability per se does not indicate that the novel proteins provide the same biochemical activities as the deleted natural proteins. Therefore, we devised a series of experiments to probe the functions of the de novo proteins.
Biological functions of the de novo proteins
The natural genes deleted in these four auxotrophs encode a range of functions (9). These are summarized below and depicted graphically in Figure S3:
- serB encodes phosphoserine phosphatase, responsible for the final step in serine biosynthesis: phosphoserine → serine + phosphate.
- gltA encodes citrate synthase, which catalyzes an early step in glutamate biosynthesis: oxaloacetate + acetyl-CoA → citrate → cis-aconitate → isocitrate → alpha-ketoglutarate → glutamate.
- ilvA encodes biosynthetic threonine deaminase, which catalyzes the first step in the production of isoleucine from threonine: threonine → 2-ketobutyrate + ammonia → → → isoleucine.
- fes encodes enterobactin esterase, which cleaves the iron-bound enterobactin siderophore, thereby enabling cells to acquire iron in iron-limited environments.
In principle, the novel proteins could rescue these auxotrophs either by providing the same function as the deleted enzyme, or by one of the three alternative mechanisms described below:
First, we considered the possibility that an auxotroph in a biosynthetic pathway might be rescued by a de novo sequence that produces the end product via a novel bypass pathway. Although it seems unlikely that small, unevolved proteins could catalyze novel biosynthetic pathways, we nonetheless tested this possibility. If the novel sequences that rescue ΔserB, ΔgltA, and ΔilvA encode bypass pathways for the synthesis of serine, glutamate, and isoleucine, respectively, then these de novo sequences would also rescue cells deleted for enzymes that function at other steps in these biosynthetic pathways. (Pathways are shown in Figure S3.) This possibility was tested by the following three experiments:
- De novo sequences that rescued ΔserB were transformed into ΔserC cells, which are deleted for phosphoserine aminotransferase. This enzyme converts phosphohydroxypyruvate to phosphoserine in the step prior to that catalyzed by the serB-encoded enzyme. (Because the serC gene product is also involved in the biosynthesis of pyridoxine, the selective media for this experiment was supplemented with pyridoxine.)
- The sequence that rescued ΔgltA was transformed into Δicd cells. This strain is also a glutamate auxotroph because it cannot catalyze the conversion of isocitrate to alpha-ketoglutarate, the direct precursor of glutamate.
- The sequences that rescued ΔilvA were transformed into ΔilvC and ΔilvD cells, which are deleted for isomeroreductase and dihydroxyacid dehydratase, respectively. Both these enzymes function downstream of ilvA in the biosynthesis of isoleucine. (Because the ilvC and ilvD gene products are also involved in valine biosynthesis, the selective media for these experiments was supplemented with valine.)
In all cases, the de novo sequences did not rescue E. coli cells deleted for functions at other steps in the biosynthetic pathways. This demonstrates that the novel proteins do not enable bypass pathways for the synthesis of their respective end products.
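The logic of these three experiments can be expressed as a small consistency check. The sketch below is an illustrative model only: the mechanism labels (`step_specific`, `bypass`) and function names are hypothetical, and the bypass case assumes an independent route to the end product that makes every step of the natural pathway dispensable. The observed outcomes are those reported above.

```python
MECHANISMS = ("step_specific", "bypass")  # hypothetical labels for this sketch

def predicts_rescue(mechanism, rescued_deletion, tested_deletion):
    """Does the mechanism predict growth of the tested deletion on minimal media?"""
    if mechanism == "bypass":
        # An independent route to the end product should rescue any pathway step.
        return True
    if mechanism == "step_specific":
        # Compensation for one missing step rescues only that deletion.
        return tested_deletion == rescued_deletion
    raise ValueError(f"unknown mechanism: {mechanism}")

def consistent_mechanisms(rescued_deletion, observations):
    """Return the mechanisms whose predictions match every observed outcome."""
    return [
        m for m in MECHANISMS
        if all(predicts_rescue(m, rescued_deletion, gene) == grew
               for gene, grew in observations.items())
    ]

# Outcomes reported in the text: True = growth on selective media.
observed = {
    "serB": {"serB": True, "serC": False},
    "gltA": {"gltA": True, "icd": False},
    "ilvA": {"ilvA": True, "ilvC": False, "ilvD": False},
}

for deletion, outcomes in observed.items():
    print(deletion, consistent_mechanisms(deletion, outcomes))
```

In this model, "step_specific" covers any mechanism that compensates for one missing step; it does not imply that the de novo protein catalyzes the same reaction as the deleted enzyme.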
Second, we considered the possibility that our de novo sequences might rescue the auxotrophs by altering the expression or activity of an endogenous E. coli protein. To assess this possibility, we relied on an exhaustive screen by Patrick et al., who identified natural genes whose overexpression can rescue the deletion of noncognate genes in E. coli (i.e. ‘multicopy suppressors’). By screening the complete set of overexpressed E. coli ORFs, they found the following multicopy suppressors of our four strains: ΔilvA can be rescued by overexpression of tdcB , or emrD; ΔserB can be rescued by overexpression of gph, hisB, ; and Δfes can be rescued by overexpression of thiL. ΔgltA was not rescued by overexpression of any E. coli genes. To probe whether our novel sequences rely on these E. coli proteins for rescue, we tested whether our proteins rescue the following double deletion strains: ΔilvA . Transformation with the appropriate novel sequences showed that all five of these double knockouts were rescued by all of the corresponding artificial sequences listed in . The ability of our novel sequences to rescue these double deletions shows that the artificial proteins do not function by altering the expression or activity of E. coli proteins known previously to enable rescue.
The double mutant ΔfesΔthiL cannot be constructed because deletion of thiL is lethal. The ΔserBΔhisB strain was constructed, but since ΔhisB is itself an auxotroph, our novel sequences did not rescue this strain. (Successful rescue would require the unlikely occurrence of a single de novo protein replacing the activities of two conditionally essential genes.) Therefore, dependence on thiL and hisB could not be assessed. In addition, it remains possible that some other E. coli protein, which escaped selection by Patrick et al., could act as a partner in the rescue mediated by our de novo sequences. With the exception of these caveats, our results demonstrate that the known multicopy suppressors are not required for the biological activities of our de novo proteins.
Third, we considered the possibility that the de novo proteins might rescue the deletion strains by a mechanism that does not depend on the specific sequences (), but instead involves global alterations in metabolism that are induced by the mere expression of foreign genes. For example, although our proteins were designed to fold into alpha-helical bundles, we considered the possibility that sequences isolated by our selections might be unfolded, and thereby induce a cellular stress response. To assess folding, we purified several proteins and measured their circular dichroism spectra. These spectra (shown in Figure S4) demonstrate that the structures are predominantly alpha-helical, and they resemble the spectra of designed 4-helix bundles whose structures were solved previously by NMR. Thus, rescue is not caused by unfolded sequences inducing a stress response.
We also note that if a generic stress response were responsible for rescue, one would expect all the Syn proteins to rescue all the deletions. This is not the case. Syn-Fes does not rescue ΔgltA, Syn-SerB does not rescue ΔilvA, and so on. Thus, specific de novo sequences mediate the rescue of specific chromosomal deletions.
To demonstrate explicitly that rescue depends on a specific sequence, rather than a generic cellular response to the expression of foreign genes, we mutated one of the de novo proteins. The design of this mutant was based on an analysis of the common features among the de novo sequences that rescued ΔilvA. Alignment of the Syn-IlvA sequences revealed two conserved polar residues: Glu36 and Lys42. We constructed the Lys42→Ala mutation in Syn-IlvA-1, and found that although this protein was expressed at the same level as the parental sequence, it failed to rescue ΔilvA. (These results are shown in Figure S5.) Thus, rescue is not simply due to expression of an exogenous gene; it is mediated by sequence-specific features of Syn-IlvA-1.
Fes differs from the other deletions because it is not involved in a biosynthetic pathway. Fes functions in iron acquisition. E. coli secretes the enterobactin siderophore (MW, 670 Da) into the media, where it binds iron, and is transported back into the cell. Because the affinity of enterobactin for iron is extremely high (KD M), release of the metal requires degradation of the siderophore. This is catalyzed by the Fes protein, enterobactin esterase. The impact of the de novo syn-fes sequences on iron acquisition is dramatic: Elemental analysis shows that cells expressing the Syn-Fes proteins accumulate 6- to 10-fold more iron than control cells (Figure S6). In principle, the de novo proteins could rescue Δfes either by functioning as an esterase or by some alternative mechanism. For example, the de novo proteins could enable iron acquisition by being transported into the media, binding iron with high affinity, being re-transported back into the cell, and then releasing the tightly bound iron intracellularly. Although possible, this alternative mechanism seems unlikely for novel proteins that occur so frequently in a semi-random library.
Irrespective of whether the de novo proteins rescue auxotrophs by providing the same function as the deleted enzyme or by some alternative mechanism, we expected the artificial sequences – which were neither selected by evolution (in vivo or in vitro), nor explicitly designed for enzymatic activity – to have far lower levels of activity than naturally evolved sequences. To estimate this level of activity in vivo, we compared the growth rates of deletion strains expressing a high level (400X) of Syn-IlvA-1 to the same strain expressing a low level (1X) of the natural protein, encoded by ilvA. These experiments showed that even when the de novo protein is expressed at ~400-fold higher levels than the natural protein, cells grow much more slowly. (Growth rates are shown in Figure S7.) As expected, the de novo protein exhibits a very low level of biological activity.
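This comparison implies a rough upper bound on per-molecule activity. The sketch below assumes, as a simplification not tested in the text, that growth rate rises monotonically with total activity (expression level multiplied by per-molecule activity). If cells expressing the de novo protein at 400X still grow more slowly than cells expressing the natural enzyme at 1X, then the per-molecule activity of the de novo protein would be below 1/400 of the natural enzyme. The function name is illustrative.

```python
def specific_activity_upper_bound(expression_fold: float) -> float:
    """Upper bound on de novo per-molecule activity relative to the natural enzyme.

    Assumes growth rate increases with total activity = expression x specific
    activity, and that the de novo strain grows more slowly than the natural one.
    """
    if expression_fold <= 0:
        raise ValueError("expression_fold must be positive")
    return 1.0 / expression_fold

bound = specific_activity_upper_bound(400)  # ~400-fold overexpression, from the text
print(f"relative per-molecule activity < {bound:.4f}")
```

This is a bound rather than an estimate; it would not hold if the de novo protein acted through a mechanism whose benefit does not scale with expression level.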
Activity levels that are barely sufficient to sustain slow cell growth may not be detectable in vitro. Indeed, others have reported that although overexpression of one E. coli protein can sometimes rescue the deletion of another protein in vivo, these ‘moonlighting’ activities can be so low that they cannot be detected in vitro. Despite these concerns, we attempted to assay biochemical activities in vitro using both cell lysates and purified proteins. Lysates are easy to prepare and are more likely to contain molecular partners (proteins, cofactors, or metals) present in vivo. The disadvantage of lysates, however, is that background activities may obscure the low-level activities of the de novo proteins. We assayed for phosphoserine phosphatase, enterobactin esterase, threonine deaminase, and citrate synthase activities in lysates from cells expressing the respective de novo proteins. In several cases, activity was observed; however, lysates from control cells also showed low levels of activity. This was not surprising, particularly for the Fes and SerB activities, since previous work in our laboratory showed that E. coli lysates contain nonspecific esterases and phosphatases.
We also purified several of the de novo proteins. (To avoid contamination by the natural enzyme, purifications were from strains deleted for the natural gene.) We tested these purified proteins for the enzymatic activities deleted in the respective auxotrophs, but were unable to detect activity that was reproducibly above the controls. There are several reasons why such experiments might not demonstrate activity: (i) As noted above, the novel proteins have extremely low levels of activity. (ii) The de novo proteins may require cofactors. We do not know which cofactors might be required, as the novel proteins may use different cofactors than the natural enzymes. (iii) The novel proteins may function by partnering with an E. coli protein. For example, the specificity of an endogenous hydrolase might be altered by binding one of our helical bundles. If this were the case, activity would not be observed in preparations containing only the purified de novo protein. (iv) The Syn protein may function by a different mechanism than the deleted protein, and novel activities would not be detected in experiments designed to assay the natural enzyme.
While we do not yet know the precise mechanisms by which the novel proteins rescue these four deletions in E. coli, we have ruled out several alternative mechanisms including (i) bypass pathways, (ii) activation of known endogenous suppressors, and (iii) induction of a generic stress response (see above). The slow growth rates of cells relying on the novel proteins for survival indicate that regardless of the actual mechanism, the de novo proteins have very low levels of activity — as expected for sequences that were neither designed nor evolved for function. Irrespective of whether the de novo proteins catalyze the same biochemical reaction as the deleted natural protein, or function by some alternative mechanism (and whether they act directly or through interactions with endogenous proteins), elucidating the molecular basis of auxotroph rescue will enhance our understanding of the minimal functions necessary for cell growth.