Minimal cells comprise only the genes and biomolecular machinery necessary for basic life. Synthesizing minimal and minimized cells will improve understanding of core biology, enhance development of biotechnology strains of bacteria, and enable evolutionary optimization of natural and unnatural biopolymers. Design and construction of minimal cells is proceeding in two different directions: “top-down” reduction of bacterial genomes in vivo and “bottom-up” integration of DNA/RNA/protein/membrane syntheses in vitro. Major progress in the last 5 years has occurred in synthetic genomics, minimization of the Escherichia coli genome, sequencing of minimal bacterial endosymbionts, identification of essential genes, and integration of biochemical systems.
Design-based engineering of biological systems (also known as synthetic biology) tests understanding of the living world and harnesses its diverse repertoire to solve society’s problems [1,2]. Ideally, an engineered system should be functionally robust and predictable. Yet these features are difficult to achieve when engineering biology because of the poorly understood complexity of even the simplest single-celled organisms. An enticing way to simplify cellular complexity, test understanding and potentially facilitate engineering is to synthesize minimal cells [4–7]. Forster and Church reviewed plans of others to minimize small bacterial cells (in vivo “top-down” approach) and proposed detailed plans for synthesizing a minimal cell from biomolecular parts (in vitro “bottom-up” approach). Here, we highlight progress, challenges and prospects since these two reviews.
Minimal cells require minimal genomes, and minimal genomes require design, construction and manipulation tools at an unprecedented scale. Great progress has been made in genome construction by the J. Craig Venter Institute (JCVI; Rockville, MD, USA). JCVI constructed the 582 kilobase pair (kbp) genome of Mycoplasma genitalium, the smallest known genome of a bacterium capable of independent growth. This was done by commercial gene synthesis from oligodeoxyribonucleotides (oligos) and then stepwise assembly. Assemblies of up to quarter genomes were made in vitro and cloned in E. coli bacterial artificial chromosomes, while the final assembly used recombination in the yeast Saccharomyces cerevisiae. JCVI further improved the technology by enzymatic assembly of genes in vitro and by discovering that yeast has the remarkable capability of simultaneously recombining 25 overlapping DNA fragments to make the complete M. genitalium genome. More recently, JCVI has developed methods for manipulating and cloning whole genomes in yeast and shown synthesis of a larger 1.08 million base-pair M. mycoides JCVI-syn1.0 genome. Though sequencing has become inexpensive, the costs of chemically synthesizing genes have leveled out at ~$0.50/bp, which is prohibitive at the genome scale for typical researchers. More affordable genetic segments may be obtained from native genomes by restriction digestion or PCR-amplification, which may limit sequence design, or by improved methods for assembling genes from oligos [14,15].
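To make the scale of the synthesis cost concrete, the following minimal sketch multiplies the quoted price of ~$0.50/bp by the two genome sizes given above. The function and variable names are illustrative only, and the estimate deliberately ignores assembly, cloning, sequence verification and labor, so it should be read as a rough lower bound rather than a budget.

```python
# Rough cost of chemically synthesizing whole genomes at the quoted price.
COST_PER_BP_USD = 0.50  # approximate price per base pair quoted in the text

# Genome sizes in base pairs, taken from the text.
GENOME_SIZES_BP = {
    "M. genitalium": 582_000,
    "M. mycoides JCVI-syn1.0": 1_080_000,
}


def synthesis_cost_usd(genome_bp: int, cost_per_bp: float = COST_PER_BP_USD) -> float:
    """Return the raw synthesis cost, ignoring assembly, cloning and labor."""
    return genome_bp * cost_per_bp


for name, size_bp in GENOME_SIZES_BP.items():
    print(f"{name}: ~${synthesis_cost_usd(size_bp):,.0f}")
```

At this price, raw synthesis alone comes to roughly $0.3 million for the M. genitalium genome and $0.5 million for the M. mycoides genome.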
In contrast to genome construction, non-viral genome design and manipulation are still primitive and certainly cannot be done from scratch. For example, substantial changes in whole bacterial genomes essentially have been limited to conservative deletions (see below), programming microbes for expression of the anti-malarial drug artemisinin has taken 150 person-years of work, and coordinated over-expression of multiple proteins in a single cell is difficult to achieve. Optimization and discovery of new designs will be helped by directed evolution technologies such as multiplex automated genome engineering (MAGE). MAGE generates genomic diversity in E. coli by parallel, oligo-directed, genomic modifications.
Even the most highly reduced genome of M. genitalium contains 100 individually dispensable genes out of 525 annotated genes, so streamlining down to only essential genes is one route to minimal cells. So far, significant minimization has been carried out only in organisms with larger genomes such as E. coli (4,640 kbp; 4,434 genes) and Bacillus subtilis (4,216 kbp; 4,245 genes), aided by known sequences of closely related genomes. Genome reduction by up to 30% has proven surprisingly successful for viability, genome stabilization, promoting growth and enhancing recombinant protein production. Rather than targeted deletion, genome reductions of up to 200 kbp can also result from experimental evolution.
The smallest minimized genomes from the top-down approach will likely be produced by minimizing the already smallest genome, that of M. genitalium. JCVI has been pursuing this plan in 6 ambitious steps:
JCVI has completed steps (i)–(iv). Step (iv) was particularly challenging because of slow growth rates and because bacterial genomes engineered in yeast have DNA restriction/modification systems that are incompatible with the Mycoplasma host cell. To learn how to transplant and express chemically synthesized genomes (iv), JCVI “booted-up” a synthetic, essentially wild-type, computer-specified, Mycoplasma mycoides genome (1,080 kbp) in a closely related cell to yield “Synthia”. This technological milestone marks the dawn of “synthetic genomics” and, once costs are significantly lowered, will undoubtedly accelerate the engineering of microbial factories producing fuels, pharmaceuticals, chemicals, and novel biomaterials (see Prospects for Biotechnology). Notwithstanding the importance of this achievement, it should not be overinterpreted as synthesis of a cell or life, as standard usage of “synthetic” would imply either cell-free synthesis of the whole cell (rather than its genome) or generation of something very unnatural (rather than a genetically-modified organism). The published plan for steps (v) and (vi) is to synthesize an M. genitalium-based genome lacking all dispensable genes to boot up a “Mycoplasma laboratorium” cell (last paragraph of ref. ). However, though virtually all genes that are individually dispensable in M. genitalium have been determined, it is recognized that a major hurdle is synthetic lethals (i.e. non-viable cells when two individually viable mutations are combined).
How may cellular complexity and synthetic lethality be circumvented to allow top-down production of a minimal genome? One route is step-wise deletion of the 100 individually dispensable genes, perhaps aided by directed evolution. However, the number of combinations is astronomical, rational choice of combinations is limited by poor understanding (e.g., the functions of one fifth of the genes of M. genitalium remain to be determined), and considerably fewer than 100 of the 525 genes are likely dispensable in combination. There will also be multiple different minimal genome “solutions,” depending on the temporal order of deletion. Nevertheless, this will teach us much about redundancy in biology.
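The size of this search space can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that each of the 100 individually dispensable genes is either kept or deleted (giving 2^100 candidate genomes) and that a full step-wise deletion series could in principle be attempted in any order (giving 100! orderings). Most of these combinations would of course be non-viable, which is exactly the problem.

```python
import math

DISPENSABLE = 100  # individually dispensable genes, from the text

# Each gene is either kept or deleted, so there are 2**n candidate genomes.
subsets = 2 ** DISPENSABLE

# Deleting one gene at a time, the order may matter: n! complete orderings.
orderings = math.factorial(DISPENSABLE)

print(f"deletion subsets:   {subsets:.2e}")
print(f"deletion orderings: {orderings:.2e}")
```

Even the smaller of the two numbers, about 10^30, is far beyond exhaustive experimental testing, which is why directed evolution or rational pruning of the candidates is needed.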
A second route is evident from tables of M. genitalium genes involved in the core replicative functions of DNA, RNA and protein syntheses: these genes are in the minority, with the majority of M. genitalium genes being involved in functions such as metabolism of small molecules. Thus, if additional nutrients were supplied in the extracellular medium (and perhaps their uptake aided by encoding extra transmembrane transporters) it may be feasible to delete many more genes. This could take us down to a truly minimal, protein-coding cell: one sufficient for replication but not for metabolism of most small molecules.
Interestingly, development of such extreme metabolic dependence without loss of genetic independence may have already occurred in the reductive evolution of the intracellular bacterial endosymbionts of insects. These recently sequenced symbiont genomes include the smallest non-organellar, non-viral genomes, Carsonella ruddii (160 kbp; 213 genes) and Hodgkinia cicadicola (144 kbp; 188 genes). In contrast to mitochondrial and chloroplast evolution, there is no evidence so far of gene transfer from bacterial symbiont to host. Almost all of the core replicative functions have been predicted computationally to reside in the symbiont genome, although notable exceptions are several essential tRNAs and aminoacyl-tRNA synthetases [27,30]. Ultimate proof of genetic independence can only come from development of a defined in vitro system for replication of either a bacterial symbiont or a derivative engineered to encode any missing essential genes. Such experimental verification would constitute our third envisioned top-down route to a minimal genome.
As simple as these minimal cells may seem, it is worth noting that “there is no such thing as a simple bacterium”. Mycoplasma pneumoniae (only 816 kbp and 733 predicted genes) was recently found to have an unanticipated complexity that is humbling. Many genes have multiple modes of transcription and complicated regulation, the proteome has a similar organization to more complex organisms, and even metabolic enzymes perform multiple functions. Furthermore, there is no rapid or systematic method for determining the functions of the large numbers of genes of unknown function in any organism, minimal or otherwise.
The alternative direction to a minimal cell is bottom-up: synthesizing self-replication by pooling together essential purified biological macromolecules, their genes and their small molecule substrates. By this approach, cellular overhead including genes of unknown function can be removed, the system can be readily manipulated and tuned, and all of the components can be defined. One possibility is a DNA/RNA/protein system derived from the core replication machinery of today’s simplest cells. The other possibilities are ribonucleoprotein and RNA-alone systems modeling cells presumed to have existed billions of years ago.
A self-replicating system made solely from RNA has the advantage of avoiding altogether the complexity of protein synthesis. Indeed, the milestone of self-sustained replication of an RNA enzyme in the absence of protein was just reached using pre-synthesized half-enzymes as substrates for ligation. But this system cannot synthesize the half-enzyme substrates, which are huge compared with natural small-molecule substrates and which contain all the informational content of the replicating system. A ribozyme selected from random sequences to polymerize nucleoside triphosphates on an RNA template was published 14 years ago and its 3-dimensional structure was just solved. Yet the difficulty in developing this polymerase, which is capable of adding only 14 nucleotides, indicates that evolving it or random sequences in vitro into an RNA replicase is a distant prospect.
A protein-based self-replicating system has the advantage of connecting with our current biological systems. Detailed plans to construct protein-based self-replication from small molecule substrates by combining already-reconstituted, purified, biochemical processes for DNA/RNA/protein syntheses are essentially unchanged and under way. The proposal is to:
Of all the macromolecular components from E. coli and its bacteriophages, only 151 were hypothesized to be sufficient for the MCP, constituting a minigenome of 113 kbp. Of these 151, it is striking that 96% are for protein synthesis and that there is considerable similarity in gene number and content and genome size to the recently sequenced, extremely metabolically dependent, bacterial endosymbionts of insects (see above). An RNA/protein-based transcription/translation system has been reconstituted from purified components, but the omission of DNA does not simplify the number of genes that ultimately will be necessary to encode the whole system for self-replication. Rather, it creates a new set of challenges unsolved in the modern world: production of a functional large RNA genome that avoids inhibitory double-stranded RNA structures and replicative mutations.
Progress in step (i) has been rapid for E. coli (but slow for M. genitalium). Of the missing 1–4 key ribosomal RNA (rRNA) modification genes, 3 have just been discovered [42–44]. The gene for modifying transfer RNA (tRNA) A37 to t6A has also been found and shown to be essential for E. coli viability. This leaves as few as one other gene to find, involved in modifying tRNA U34 to cmo5U, with 2 genes in that pathway already known. Thus, reconstitution from purified components of every subsystem of the MCP is tantalizingly near. In an attempt to close perhaps the biggest remaining gap, we are over-expressing the 5 known key rRNA modification enzymes to test for activation of unmodified 23S rRNA transcripts necessary for synthesis of ribosomes in vitro.
Less progress has been reported on steps (ii)–(iv). With regard to step (ii), though the E. coli translation apparatus and ribosome were reconstituted separately from purified cellular components 3 decades ago, their translational accuracy is poorly characterized and in vitro efficiencies of protein synthesis and ribosome turnover remain low in both purified and crude systems (Table 1). The break-even milestone for ribosomes making all of the proteins in the proposed minigenome is synthesis of ~35,000 peptide bonds by each ribosome (including 7,491 peptide bonds for the ribosomal proteins). Towards the integration required for steps (iii) and (iv), bacterial transcription initiation has been reconstituted in a purified translation system, purified DNA-dependent transcription and translation have been performed within liposomes, and membrane proteins involved in phospholipid synthesis have been synthesized in active form in liposomes. But some of the other subsystems require unphysiological conditions that preclude integration. Simple systems for DNA replication require thermocycling and oligo primers (PCR or circle-to-circle amplification), while self-assembly of the E. coli ribosome from natural components requires low and high Mg2+ concentrations, high temperatures and long incubation times. Nevertheless, physiological conditions for E. coli ribosome assembly have now been found and rRNA synthesis, ribosome assembly and translation (Fig. 1) have been integrated under batch conditions (Jewett and Church, submitted). The next steps will be substitution of the E. coli cells and extracts used for the macromolecule syntheses by purified subsystems.
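The break-even figures quoted above can be put in proportion with a small calculation. This sketch uses only the three numbers given in the text (~35,000 peptide bonds per ribosome, 7,491 of them for ribosomal proteins, and a 113 kbp minigenome). The codon count is a loose upper bound that assumes, unrealistically, that every base pair of the minigenome codes for protein; the variable names are illustrative.

```python
BREAK_EVEN_BONDS = 35_000         # approximate break-even figure from the text
RIBOSOMAL_PROTEIN_BONDS = 7_491   # bonds needed for ribosomal proteins (text)
MINIGENOME_BP = 113_000           # proposed minigenome size from the text

# Fraction of each ribosome's output that must go into new ribosomal proteins.
ribosomal_share = RIBOSOMAL_PROTEIN_BONDS / BREAK_EVEN_BONDS

# Bonds left over for all the non-ribosomal proteins of the system.
other_bonds = BREAK_EVEN_BONDS - RIBOSOMAL_PROTEIN_BONDS

# Upper bound on codons if the whole minigenome were protein-coding
# (an assumption; part of the minigenome encodes RNAs).
max_codons = MINIGENOME_BP // 3

print(f"ribosomal protein share: {ribosomal_share:.1%}")
print(f"bonds for other proteins: {other_bonds:,}")
print(f"codon upper bound for 113 kbp: {max_codons:,}")
```

On these numbers, roughly one fifth of each ribosome's break-even output would have to be ribosomal protein, and the ~35,000-bond target sits just below the codon upper bound implied by the minigenome size.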
How might the efficiencies and utilities of purified systems be improved? There are some recent indications that adding genes not on the minimal list should help. Inclusion of translation elongation factors not present in PURE kits might improve efficiency and/or accuracy: EF-P facilitates formation of the first peptide bond by positioning fMet-tRNAifMet, and LepA promotes back translocation of the mRNA-tRNA complex [53,54]. Comprehensive analysis of the individual effects of every E. coli protein on purified translation showed that 344 (8%) were stimulatory. Most beneficial were ATP-dependent RNA helicase, HrpA, and trigger factor, increasing yields by ~80% and ~30%, respectively. More than 20 different auxiliary factors are thought to facilitate ribosome assembly, including chaperones, GTPases and helicases. For example, ATP-dependent RNA helicase, DbpA, has specificity for 23S rRNA, and RimJ functions in ribosomal protein acetylation and in 30S subunit assembly. Choices for gene addition will be informed by studies such as the measurement of kinetic effects on 30S assembly of Era, RimM and RimP. Also, cytoplasmic mimicry has been shown to be a powerful guiding principle. Mimicking combined energy metabolism, oxidative phosphorylation and protein synthesis in crude extracts increased protein synthesis yields (Table 1; [60,61]). Activating natural energy metabolism in crude extracts reduces costs and suggests that incorporating metabolic modules into the MCP could further increase utility.
It should be emphasized that genes other than the 151 may ultimately prove necessary for self-replication and that, while the MCP would certainly be helpful in revealing their existence, such mystery genes would be hard to identify. Identification may proceed through traditional biochemical purifications from extracts or by modern high throughput genetic screens. Another challenge looming is how to achieve coordinated control of so many genes.
Minimal cell syntheses are still in their formative stages where the main rewards are new molecular tools and a better understanding of the core genetic and biochemical systems necessary for basic life. But applications in biotechnology are close at hand. Based on the improved stability, growth and protein production of E. coli and other biotechnology workhorses upon reducing their genomes [21–23], further minimized strains should replace most current commercial bacterial strains. Biotech applications of reduced-genome M. genitalium are less clear because of its fragility and much slower growth rate (doubling time in culture of 12 h). However, M. genitalium has the advantage of having the smallest genome, facilitating synthesis of variant genomes, and it is conceivable that its limitations might be addressed by synthetic genomics. Synthetic genomics will be particularly helpful for redesigning microbes for which genetic tools are poor.
The MCP mostly involves synthesis and optimization of purified translation systems. Compared with alternative methods of protein synthesis, such systems have a number of advantages: lack of RNases, proteases and inclusion bodies, high compatibility with cytotoxic proteins, flexibility of incorporation of unnatural amino acids, ease of product purification, and direct control of reaction conditions. The main hurdle preventing application of the PURE system in biotechnology is the high cost (Table 1) due to its production from >30 different fermentations. To address this limitation, we are developing a cost-effective method for over-expressing the entire system in a single E. coli cell followed by single batch purification [18,19,63]. Selection of variant 23S rRNAs for improved unnatural amino acid incorporation could be uncoupled from cell viability by synthesizing ribosomes in vitro; such variants would facilitate the directed evolution of peptidomimetic drug candidates.
In conclusion, significant progress has been made in both the top-down and bottom-up approaches to minimal cells in the last 5 years. Both approaches are providing new tools, fundamental biological knowledge and potential biotech applications distinct from those garnered from other fields. Though major challenges lie ahead, the era of biology by design has begun.
We are grateful to George Church for advice and comments on the manuscript and John Glass, John McCutcheon and Michael Sismour for comments on the manuscript. This work was supported by the National Institutes of Health and National Academies Keck Futures Initiative (to MCJ and ACF), the National Science Foundation (to MCJ), and the American Cancer Society (to ACF).