A protein-based self-replicating system has the advantage of connecting with our current biological systems. Detailed plans to construct protein-based self-replication from small molecule substrates by combining already-reconstituted, purified, biochemical processes for DNA/RNA/protein syntheses [4] are essentially unchanged and under way. The proposal is to:
- (i) identify the necessary genes,
- (ii) prepare efficient purified biochemical subsystems from the gene products,
- (iii) integrate the subsystems for self-replication, and
- (iv) encapsulate the system within a membrane to give a synthetic cell (“synthetic life”).
Of all the macromolecular components from E. coli and its bacteriophages, only 151 were hypothesized to be sufficient for the MCP, constituting a minigenome of 113 kbp [4]. Of these 151, it is striking that 96% are for protein synthesis and that there is considerable similarity in gene number, gene content and genome size to the recently sequenced, extremely metabolically dependent bacterial endosymbionts of insects (see above). An RNA/protein-based transcription/translation system has been reconstituted from purified components [40], but the omission of DNA does not reduce the number of genes that ultimately will be necessary to encode the whole system for self-replication. Rather, it creates a new set of challenges unsolved in the modern world: production of a functional large RNA genome that avoids inhibitory double-stranded RNA structures and replicative mutations [35].
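The minigenome figures quoted above can be put in proportion with a little arithmetic. The following is only a back-of-envelope sketch: the three constants are taken from the text, the function names are hypothetical, and the rounded count is approximate because the text gives a percentage rather than an exact number.

```python
# Figures quoted in the text for the proposed minigenome.
N_COMPONENTS = 151               # macromolecular components hypothesized to suffice
GENOME_BP = 113_000              # minigenome size (113 kbp)
PROTEIN_SYNTHESIS_SHARE = 0.96   # fraction of components used for protein synthesis


def protein_synthesis_components(n=N_COMPONENTS, share=PROTEIN_SYNTHESIS_SHARE):
    """Approximate number of components devoted to protein synthesis."""
    return round(n * share)


def mean_bp_per_component(genome_bp=GENOME_BP, n=N_COMPONENTS):
    """Average genome length per component, non-coding sequence included."""
    return genome_bp / n


print("components for protein synthesis (approx.):", protein_synthesis_components())
print("mean bp per component (approx.):", round(mean_bp_per_component()))
```

The average length is a crude measure, since the components include RNAs as well as proteins and the minigenome also carries non-coding sequence.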
Progress in step (i) has been rapid for E. coli (but slow for M. genitalium [41]). Of the missing 1–4 key ribosomal RNA (rRNA) modification genes, 3 have just been discovered [42–44]. The gene for modifying transfer RNA (tRNA) A37 to t6A has also been found and shown to be essential for E. coli viability [45]. This leaves as few as one other gene to find, involved in modifying tRNA U34 to cmo5U, with 2 genes in that pathway already known [46]. Thus, reconstitution from purified components of every subsystem of the MCP is tantalizingly near. In an attempt to close perhaps the biggest remaining gap, we are over-expressing the 5 known key rRNA modification enzymes [4] to test for activation of unmodified 23S rRNA transcripts necessary for synthesis of ribosomes in vitro.
Less progress has been reported on steps (ii)–(iv). With regard to step (ii), though the E. coli translation apparatus and ribosome were reconstituted separately from purified cellular components 3 decades ago, their translational accuracy is poorly characterized and in vitro efficiencies of protein synthesis and ribosome turnover remain low in both purified and crude systems. The break-even milestone for ribosomes making all of the proteins in the proposed minigenome [4] is synthesis of ~35,000 peptide bonds by each ribosome (including 7491 peptide bonds for the ribosomal proteins). Towards the integration required for steps (iii) and (iv), bacterial transcription initiation has been reconstituted in a purified translation system [47], purified DNA-dependent transcription and translation have been performed within liposomes [48], and membrane proteins involved in phospholipid synthesis have been synthesized in active form in liposomes [49]. But some of the other subsystems require unphysiological conditions that preclude integration. Simple systems for DNA replication require thermocycling and oligo primers (PCR or circle-to-circle amplification [50]), while self-assembly of the E. coli ribosome from natural components requires low and high Mg2+ concentrations, high temperatures and long incubation times [51]. Nevertheless, physiological conditions for E. coli ribosome assembly have now been found and rRNA synthesis, ribosome assembly and translation have been integrated under batch conditions (Jewett and Church, submitted). The next steps will be to replace the E. coli cells and extracts used for the macromolecule syntheses with purified subsystems.
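The break-even milestone quoted above lends itself to a small worked calculation. The sketch below uses only the two figures given in the text (~35,000 peptide bonds per ribosome, of which 7491 are for the ribosomal proteins). The function names and the example yield of 350 bonds per ribosome are hypothetical and chosen purely for illustration; they are not measurements from any cited system.

```python
# Figures quoted in the text for the proposed minigenome.
BREAK_EVEN_BONDS = 35_000        # approx. peptide bonds each ribosome must make
RIBOSOMAL_PROTEIN_BONDS = 7_491  # of which: bonds in the ribosomal proteins


def ribosomal_share():
    """Fraction of the break-even output spent on remaking ribosomal proteins."""
    return RIBOSOMAL_PROTEIN_BONDS / BREAK_EVEN_BONDS


def break_even_fraction(bonds_per_ribosome):
    """How far a given per-ribosome yield gets towards break-even (1.0 = reached)."""
    if bonds_per_ribosome < 0:
        raise ValueError("bonds_per_ribosome must be non-negative")
    return bonds_per_ribosome / BREAK_EVEN_BONDS


# Hypothetical in vitro yield, for illustration only (not a measured value).
example_yield = 350

print(f"share spent on ribosomal proteins: {ribosomal_share():.1%}")
print(f"fraction of break-even at {example_yield} bonds/ribosome: "
      f"{break_even_fraction(example_yield):.2%}")
```

On these figures, roughly a fifth of each ribosome's break-even output would go to remaking ribosomal proteins, with the remainder going to the other proteins encoded by the minigenome.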
Table 1. Protein yields and costs in cell-free transcription and translation systems from E. coli.
How might the efficiencies and utilities of purified systems be improved? There are some recent indications that adding genes not on the minimal list [4] should help. Inclusion of translation elongation factors not present in PURE kits might improve efficiency and/or accuracy: EF-P facilitates formation of the first peptide bond by positioning fMet-tRNAifMet [52], and LepA promotes back translocation of the mRNA-tRNA complex [53,54]. Comprehensive analysis of the individual effects of every E. coli protein on purified translation showed that 344 (8%) were stimulatory [55]. Most beneficial were the ATP-dependent RNA helicase HrpA and trigger factor, which increased yields by ~80% and ~30%, respectively. More than 20 different auxiliary factors are thought to facilitate ribosome assembly, including chaperones, GTPases and helicases [56]. For example, the ATP-dependent RNA helicase DbpA has specificity for 23S rRNA [57], and RimJ functions in ribosomal protein acetylation and in 30S subunit assembly [58]. Choices for gene addition will be informed by studies such as the measurement of the kinetic effects of Era, RimM and RimP on 30S assembly [59]. Also, cytoplasmic mimicry has been shown to be a powerful guiding principle. Mimicking combined energy metabolism, oxidative phosphorylation and protein synthesis in crude extracts increased protein synthesis yields [60,61]. Activating natural energy metabolism in crude extracts reduces costs and suggests that incorporating metabolic modules [62] into the MCP could further increase utility.
It should be emphasized that genes other than the 151 may ultimately prove necessary for self-replication and that, while the MCP would certainly be helpful in revealing their existence, such mystery genes would be hard to identify. Identification may proceed through traditional biochemical purifications from extracts or by modern high-throughput genetic screens [55]. Another looming challenge is how to achieve coordinated control of so many genes [18].