Understanding the origin of life requires knowledge not only of the origin of biological molecules such as amino acids, nucleotides and their polymers, but also the manner in which those molecules are integrated into the organized systems that characterize cellular life. In this article, we introduce a constructive approach to understand how biological molecules can be arranged to achieve a higher-order biological function: replication of genetic information.
In a constructive approach, we aim to reconstitute a biological function, such as genome replication and protein translation, and ultimately fabricate an artificial cell from molecules purified and defined in vitro (Szostak et al. 2001; Deamer 2005). During the process, we can determine what conditions are sufficient to achieve the minimum set of biological functions required for cellular life. For instance, if we can reconstitute a given biological function from a set of defined molecules, we can conclude that the properties of those molecules are sufficient to accomplish that biological function. With regard to the origins of life, this represents a parallel and complementary approach to surveying possible routes from nonliving molecules to extant living systems. If it is difficult for us to reconstruct a biological function, it may have been correspondingly difficult for that function to evolve in a primitive living organism. Knowledge of which functions are difficult to assemble from existing biological molecules, and how such hurdles can be overcome, is expected to provide insights into the origin and evolution of multifunctional extant life.
In the field of synthetic biology, researchers are now constructing artificial networks to understand the “design principles” of biological systems, which is another expression for the “sufficient conditions” concept used here. Most current studies in synthetic biology incorporate modifications of existing cells, but some investigators are constructing artificial networks from defined molecules (Benner and Sismour 2005; Simpson 2006). This is similar to what we refer to here as a constructive approach (Kaneko 2006).
To gain insight into the origins of life, it is important to determine the nonbiological origin of chemical components, such as the amino acids, nucleotides, and lipids discussed in other articles on this topic. These are the small molecules of life that assemble into the proteins, nucleic acids and membranes that are essential for contemporary cells. Primitive versions of such molecules must have given rise to the first forms of cellular life by a process of self-assembly. However, if we observe a mixture of components from disrupted Escherichia coli containing all the molecules originally present in the living cells, no spontaneous regeneration of living cells takes place. It follows that molecules per se are necessary but not sufficient for life. Molecules and their functions must be coordinated in the correct order according to intrinsic chemical and physical rules, as observed by Schrödinger, who famously described life as the “orderly and lawful behavior of matter” (Schrödinger 1944).
The primary aim of the constructive approach to protocellular life is to find sufficient conditions under which biological molecules assemble into systems that display higher-order biological functions, such as translation, replication of genetic information, cell growth, division, and nutrient transport. Some of these functions, including membrane growth (Walde 1994; Hanczyc et al. 2003), membrane growth coordinated with internal replication of genetic information (Chen et al. 2004), membrane transport of nutrients (Chakrabarti et al. 1994; Monnard and Deamer 2001; Fischer et al. 2002; Monnard et al. 2007; Mansy et al. 2008), and coupling of translation and nutrient transport (Noireaux and Libchaber 2004) have been reported previously. Here, we focus on the replication of genetic information, which is a fundamental characteristic of life.
One of the characteristics of life is the possession of genetic information that replicates by using itself as a template, a process we will refer to as “self-replication.” Several types of self-replicating systems have been constructed with bioinformational molecules. These include self-replication of DNA by polymerase chain reaction (PCR) (Saiki et al. 1985), self-sustained sequence replication (3SR) (Guatelli et al. 1990), and self-replication of RNA by Qβ replicase (Mills et al. 1967; Biebricher et al. 1985; Oberholzer et al. 1995). Other examples include self-replication of peptides (Lee et al. 1996, 1997), low molecular weight compounds (Tjivikua et al. 1990), tetranucleotides (Zielinski and Orgel 1987) and other oligomers (Sievers and von Kiedrowski 1994), and most recently the ligation activity of ribozymes (Lincoln and Joyce 2009).
These self-replication reactions can be classified according to the reaction scheme shown in Figure 1. Type 1 is the self-replication reaction in which the information molecule (DNA or RNA) is replicated by an exogenous enzyme and includes PCR (Saiki et al. 1985), 3SR (Guatelli et al. 1990), and RNA replication (Mills et al. 1967; Biebricher et al. 1985; Oberholzer et al. 1995). Type 2 self-replication does not require any replication enzymes and these reactions include self-ligating ribozymes (Lincoln and Joyce 2009), self-replicating peptides (Lee et al. 1996; Lee et al. 1997), and other low molecular weight compounds (Zielinski and Orgel 1987; Tjivikua et al. 1990; Sievers and von Kiedrowski 1994). In this type of reaction, the information molecule replicates itself in a reaction catalyzed by its own activity, and therefore this is the simplest type of self-replication. Type 3 is a modification of type 1 in which the replication enzyme is supplied internally by synthesis rather than being exogenously added. The replication enzyme encoded in the information molecule is first decoded and then catalyzes replication of the original information molecule. The information molecule serves to provide the information required for protein production and the template for replication. The self-replication of the genome in cells or viruses would also be classified as type 3, because the replication enzyme is translated from the genomic DNA or RNA and then catalyzes replication of the genome. Construction of these types of self-replication systems has been reported previously (Ghadessy et al. 2001; Matsuura et al. 2002). In these reactions, however, translation and replication are separated temporally rather than occurring simultaneously. In the next section, we will describe characteristic features of each type of self-replication from the viewpoint of evolution.
If a self-replication process continues for many generations, random mutations can be introduced into the information molecule. This makes it possible for increasingly replicable mutants to appear and dominate the population, as first shown by Bartel and Szostak (1993). There are a number of ways this can occur, including enhancement of template activity (the ability to act as a replication template) and/or catalytic activity that promotes replication. The actual mechanism depends on the type of self-replication. When type 1 self-replication evolves, the template activity is enhanced, as shown previously (Mills et al. 1967). It is notable that although type 1 self-replication requires a replication enzyme, the replication activity does not evolve because the enzyme is not encoded on the information molecule and thus no mutations are introduced to improve the enzyme activity. Consequently, in the evolution of RNA self-replication by Qβ replicase, the three genes initially encoded on the template RNA were lost during evolution, resulting in a shorter, more rapidly replicating template (Mills et al. 1967).
In type 2 self-replication, a single information molecule has both template and catalytic functions, so both activities are able to evolve (Lincoln and Joyce 2009). In type 3 self-replication, even though the two activities (template activity of DNA or RNA and replication activity of replicase) reside in different molecules, both can evolve. This is because the replication catalyst is encoded in the information molecule, so mutations in that molecule can alter the catalyst. To achieve evolution of the replication enzyme in type 3 self-replication, an additional condition is required: encapsulation of the reaction in a compartment with a small number of information molecules. This condition links each information molecule with the replicase it encodes, so that the translated replication enzyme can interact with the information molecule from which it originated. Without compartmentalization of a low number of information molecules, even if a highly active replicase arose because of mutation, the replicase would amplify the wild-type information rather than its own information containing the mutation (Szostak 1999; Szostak et al. 2001; Matsuura et al. 2002). This requirement for a small number of information molecules was recently shown by an experiment (Sunami et al. 2006) in which we encapsulated two types of GFP genes, GFPuv5 and GFPuv2 (GFPuv5:GFPuv2 = 0.85:0.15), together with a cell-free transcription-translation system, into a liposome population having a diverse size range. GFPuv5 shows an eightfold higher fluorescence signal than GFPuv2. Following translation, we collected liposomes showing high GFP fluorescence with a fluorescence-activated cell sorter (FACS) and investigated the ratio of the GFPuv5 gene to the GFPuv2 gene in the collected liposome population. We observed that the ratio was dependent on the liposome volume: in liposomes with a large volume (150 fL), the total number of genes per liposome was more than ten and the GFPuv5/GFPuv2 gene ratio was low (Fig. 2). In contrast, liposomes with a small volume (5–10 fL) were found to have nearly one gene per liposome and a high GFPuv5/GFPuv2 gene ratio. These observations showed that the gene encoding a highly active protein was selected efficiently in small liposomes, which contain a small number of genes.
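The dependence of selection efficiency on gene copy number can be illustrated with a toy calculation. The sketch below is not the analysis used in the experiment: it assumes that each gene in a liposome is independently the brighter variant with probability 0.85 (the mixing ratio quoted above), that fluorescence increases with the number of bright genes, and that the sorter collects the brightest half of the liposomes within each gene-count class. The function name and the `top_fraction` parameter are hypothetical.

```python
from math import comb


def selected_bright_fraction(n_genes, p_bright=0.85, top_fraction=0.5):
    """Mean fraction of bright genes among the brightest liposomes.

    Assumption (not from the article): all liposomes in the class carry
    exactly n_genes genes, and brightness ranks liposomes by the number k
    of bright genes they contain.
    """
    remaining = top_fraction  # probability mass still to be collected
    weighted = 0.0            # collected mass weighted by bright-gene fraction
    # Walk from the brightest class (k = n_genes) downward.
    for k in range(n_genes, -1, -1):
        prob = comb(n_genes, k) * p_bright**k * (1 - p_bright) ** (n_genes - k)
        take = min(prob, remaining)
        weighted += take * k / n_genes
        remaining -= take
        if remaining <= 0:
            break
    return weighted / top_fraction


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 20):
        print(n, round(selected_bright_fraction(n), 3))
```

Under these assumptions, a single-gene liposome selected for brightness always carries the bright gene, whereas with ten or more genes per liposome the selected population stays closer to the starting mixture, which is the qualitative trend described in the text.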
Primitive life presumably involved type 2 self-replication because of its simplicity. Over time, the replication process would have become more complex by evolutionary selection and approached the type 3 self-replication systems used by extant life. Our purpose was to construct a type 3 self-replication system in which translation and replication occur simultaneously, and to optimize conditions under which type 3 self-replication can function efficiently. The encapsulated system incorporates an information molecule and the translation machinery required to decode the information. The information molecule encodes a replication enzyme, which serves to replicate the information molecule.
Figure 3A shows a schematic representation of our type 3 self-replication system. The system consists of a template RNA as an information molecule and a reconstituted cell-free translation system (PURE system) as the decoding machinery, all of which were encapsulated in phospholipid vesicles (liposomes) (Kita et al. 2008). The RNA molecule encodes the catalytic subunit of RNA-dependent RNA polymerase (Qβ replicase), derived from an E. coli RNA phage, and has recognition sequences for the Qβ replicase at the termini. During the reaction, the Qβ replicase subunit is first translated from the template RNA and forms active Qβ replicase with EF-Tu and EF-Ts, which are elongation factors for translation and are contained in the PURE system. The translated replicase then binds to the original template RNA (plus strand) and synthesizes the complementary RNA strand (minus strand). As the minus strand can also act as a template for the replicase, the RNA strand complementary to the minus strand (i.e., the plus strand) is synthesized in a similar manner. In this way, the information molecule, plus strand RNA, is self-replicated by the self-encoded replication enzyme. Additionally, to monitor self-replication by fluorescence, we introduced the β-galactosidase sequence into the minus strand. The β-galactosidase is translated after minus strand synthesis and catalyzes hydrolysis of nonfluorescent 5-chloromethylfluorescein di-β-D-galactopyranoside (CMFDG; Invitrogen, USA) to yield green fluorescent 5-chloromethylfluorescein (CM-fluorescein).
The reaction system consists of 144 gene products (3 rRNAs, 46 tRNAs, 55 ribosomal proteins, 38 proteins), amino acids, other low molecular weight compounds and the template RNA (Table 1). This is a purified reconstituted system in which all of the components and their concentrations are defined. The number of components is amazingly large, yet this is one of the simplest encapsulated systems for carrying out protein translation and RNA replication. With regard to the origin of life, the first living systems would have had functionally identical translation and replication systems, but they must have been simpler and contained machinery for nutrient transport. The complexity of our system implies that extant translation machinery has become highly sophisticated during the evolutionary process.
The self-replication system was encapsulated in lipid vesicles prepared by the freeze-dried empty liposome method (Sato et al. 2005) using a phospholipid mixture of 1-palmitoyl-2-oleoyl-sn-phosphatidylcholine (POPC), cholesterol, and distearoyl phosphatidylethanolamine-polyethylene glycol 5000 (DSPE-PEG5000) at a molar ratio of 58:39:3. The liposomes were multilamellar and the internal volume was found to range from 1 to 100 fL, with the most frequent volume about 4 fL. The internal volume was estimated from the fluorescence intensity of the red fluorescent protein, R-phycoerythrin (R-PE), encapsulated as a volume marker and measured by FACS (Sato et al. 2006; Sunami et al. 2006).
The encapsulated self-replication reaction produces the minus strand, from which β-galactosidase is translated. The translated β-galactosidase hydrolyzes the fluorogenic substrate to produce a green fluorescent product. Hence we could monitor the progress of the self-replication reaction by measuring green fluorescence. FACS analysis showed that the number of liposomes showing green fluorescence increased over time (Fig. 3). We defined the liposomes harboring a substantial amount of products as “reacted liposomes” (liposomes with green fluorescence larger than the dotted line in Fig. 3B). The frequency of reacted liposomes depended on the liposome volume. Statistical analysis showed that the frequency of the reaction occurring per unit volume was constant (0.013 per femtoliter), indicating that the frequency of self-replication was only 5.2% in the case of a typical 4 fL liposome. This implies low efficiency of self-replication in the liposome. The low efficiency is not caused by a lack of components, because even the smallest liposome at 1 fL is considered to contain all of the components and substrates in our system (Table 1). There are many possible reasons for the low efficiency, including degradation of RNAs, inactivation of enzymes, accumulation of inhibitory products and competition between translation and replication. The most plausible possibility for type 3 self-replication is the last one: competition between translation and replication, which could lower reaction efficiency. In the next section, we describe evaluation of the effects of competition for our self-replication system.
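The 5.2% figure follows directly from the per-volume frequency, as the short sketch below shows. The linear estimate reproduces the arithmetic in the text; the Poisson variant is an added assumption (independent reaction events occurring at a constant rate per unit volume), included only to show that the two estimates nearly coincide for small liposomes. Function names are hypothetical.

```python
import math

# Frequency of the reaction per unit volume, as reported in the text.
REACTION_FREQ_PER_FL = 0.013


def reacted_fraction_linear(volume_fl, freq_per_fl=REACTION_FREQ_PER_FL):
    """Expected fraction of reacted liposomes, frequency times volume."""
    return freq_per_fl * volume_fl


def reacted_fraction_poisson(volume_fl, freq_per_fl=REACTION_FREQ_PER_FL):
    """Probability of at least one reaction event in a liposome.

    Assumption (not from the article): events are Poisson distributed
    with mean freq_per_fl * volume_fl.
    """
    return 1.0 - math.exp(-freq_per_fl * volume_fl)


if __name__ == "__main__":
    for volume in (1, 4, 10, 100):
        print(
            f"{volume:>4} fL  linear={reacted_fraction_linear(volume):.3f}  "
            f"poisson={reacted_fraction_poisson(volume):.3f}"
        )
```

The linear estimate is only meaningful while the product of frequency and volume stays well below 1; for the largest liposomes in the reported range (100 fL) it would exceed 1, which is why a saturating form such as the Poisson expression is shown alongside it.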
Type 3 self-replication is characterized by the dual roles of the information molecule: the information for protein production and template for replication. If the two roles compete, the efficiencies of these reactions in the self-replication reaction become lower than what we expect for the individual activities of translation or replication. Occurrence of this effect arises from the nature of the ribosomes and replication enzyme. In this case, the ribosomes and Qβ replicase were presumed to compete because it was shown that if either is bound to an RNA molecule, the other cannot use the bound RNA as a template (Kolakofsky and Weissmann 1971). The existence of such competition implies that there is an optimum balance between translation and replication. That is, translation of replicase is required for minus strand synthesis, but excess translation by too many ribosomes inhibits minus strand synthesis. To evaluate this competition effect quantitatively, we used a kinetic model to describe part of the self-replication reaction (Fig. 4) (Ichihashi et al. 2008).
The kinetic model contains four components: plus strand RNA, minus strand RNA, RNA replicase (Qβ replicase) and ribosome. It comprises six reactions encompassing binding and dissociation of the plus strand RNA with RNA replicase and ribosome, translation, and minus strand synthesis. The binding and dissociation reactions are assumed to be in equilibrium. The forward reactions favor translation of RNA replicase (decoding processes). The downward reactions tend toward minus strand synthesis (replication processes). The competition effect is represented as the ternary complex of the plus strand with ribosome and replicase (Rep-Rib-P), which is incapable of translation and replication.
The kinetic model has four measurable parameters: dissociation constants for ribosome and replicase, and catalytic constants for translation and replication. Taking advantage of the reconstituted system, we varied the concentration of each component to estimate all four parameters. From the above model and parameters, we could predict an optimum ribosome concentration for minus strand synthesis because of the competition effect (Fig. 5). To examine this prediction experimentally, we varied ribosome concentration in a cell-free translation system, and measured the amount of synthesized minus strand by quantitative PCR after reverse transcription. The experimental results yielded a bell-shaped curve (Fig. 5) and the optimum ribosome concentration was close to the predicted value, indicating the validity of the kinetic model.
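A simplified numerical sketch shows how a model of this kind produces an optimum ribosome concentration. It is not the published model of Ichihashi et al. (2008): it assumes that ribosome and replicase bind the plus strand independently, that free concentrations can be approximated by total concentrations, and that the plus strand concentration stays constant. All parameter values are hypothetical placeholders except the replicase dissociation constant of about 20 nM, which is the value quoted in this article.

```python
def minus_strand_yield(
    rib_nM,
    plus_nM=100.0,    # plus strand concentration (hypothetical)
    k_d_rib=50.0,     # ribosome dissociation constant, nM (hypothetical)
    k_d_rep=20.0,     # replicase dissociation constant, nM (from the text)
    k_cat_tl=1e-4,    # replicase made per Rib-P complex per second (hypothetical)
    k_cat_rep=1e-2,   # minus strands made per Rep-P complex per second (hypothetical)
    t_end=3600.0,
    dt=1.0,
):
    """Integrate replicase and minus strand production with Euler steps."""
    rep = 0.0
    minus = 0.0
    rib_bound = rib_nM / (k_d_rib + rib_nM)  # fraction of P bound by ribosome
    for _ in range(int(t_end / dt)):
        rep_bound = rep / (k_d_rep + rep)    # fraction of P bound by replicase
        # Only singly bound complexes are productive; the ternary complex
        # Rep-Rib-P (both bound) neither translates nor replicates.
        rib_only = rib_bound * (1.0 - rep_bound)
        rep_only = rep_bound * (1.0 - rib_bound)
        rep += k_cat_tl * plus_nM * rib_only * dt
        minus += k_cat_rep * plus_nM * rep_only * dt
    return minus


def scan_ribosome(concentrations_nM):
    """Return (ribosome concentration, minus strand yield) pairs."""
    return [(c, minus_strand_yield(c)) for c in concentrations_nM]


if __name__ == "__main__":
    scan = scan_ribosome([1, 3, 10, 30, 100, 300, 1000, 3000, 10000])
    for conc, amount in scan:
        print(f"{conc:>6} nM ribosome -> {amount:8.1f} nM minus strand")
    print("optimum at", max(scan, key=lambda pair: pair[1])[0], "nM")
```

With too few ribosomes, little replicase is translated; with too many, most plus strands are occupied by ribosomes and are unavailable to the replicase. The yield therefore peaks at an intermediate concentration, giving the bell-shaped dependence described above, although the position of the peak here depends entirely on the placeholder parameters.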
To summarize, we found that our self-replication system showed competition between translation and replication, and we were able to evaluate the effect quantitatively. Such a competition effect in type 3 self-replication is inevitable as long as the dual roles (template and information) are inherent in the information molecule, which causes mutual inhibition of the translation and replication machineries. However, optimizing the ribosome concentration minimizes the inhibitory effect, indicating that the balance between translation and replication is important for efficient type 3 self-replication.
The kinetic analysis has a further implication, concerning the possible size of an evolvable artificial cell. The analysis revealed that the dissociation constant of the RNA with the replicase was about 20 nM (Ichihashi et al. 2008), indicating that the concentration of RNA should be at nanomolar levels or higher for an efficient reaction. This requirement limits the possible size of an evolvable artificial cell, considering that the information molecule (RNA in this case) must exist in low copy number for evolution, as described above. For example, one information molecule corresponds to approximately 1 nM in a 1 µm cell, but 1 pM in a 10 µm cell. Therefore, the size of an evolvable artificial cell including the self-replication system should be on the order of 1 µm for efficient internal RNA replication. This notion implies that the possible size of an evolvable cell would be limited by the affinity of internal components.
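The size argument can be checked with a short calculation of the molar concentration of a single molecule in a compartment. The sketch assumes a spherical cell and treats the quoted size as its diameter; both are assumptions, since the text does not specify the geometry. The function name is hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def single_molecule_concentration(diameter_um, n_molecules=1):
    """Molar concentration of n_molecules in a sphere of the given diameter.

    Assumption (not from the article): the cell is a sphere and
    diameter_um is its diameter in micrometers.
    """
    radius_m = diameter_um * 1e-6 / 2.0
    volume_m3 = 4.0 / 3.0 * math.pi * radius_m**3
    volume_l = volume_m3 * 1e3  # 1 cubic meter = 1000 liters
    return n_molecules / (AVOGADRO * volume_l)


if __name__ == "__main__":
    for diameter in (1, 10):
        conc = single_molecule_concentration(diameter)
        print(f"{diameter:>2} um cell: {conc:.2e} M")
```

Under the spherical assumption the result is a few nanomolar for a 1 µm cell and a few picomolar for a 10 µm cell, the same orders of magnitude as the approximate values given in the text. Because volume scales with the cube of the diameter, a tenfold larger cell dilutes a single molecule a thousandfold, placing it far below the 20 nM dissociation constant.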
In this article, we present our recent investigation of a type 3 self-replication system and show the importance of the translation/replication balance. However, even with the optimum ribosome concentration, self-replication in liposomes is still inefficient. This is because the system has more than a hundred components, and not every molecule in the system functions efficiently, probably because of unexpected interactions such as competition. Further studies are required to determine the conditions under which all components function in a coordinated fashion to achieve efficient self-replication (Pohorille and Deamer 2002).
How do we find these coordinated conditions? One approach is to adopt an evolutionary strategy, mutating the RNA of the self-replication system and selecting mutants showing greater replication. For instance, the selective process could be enabled by selecting for liposomes with higher levels of fluorescence by FACS. This type of evolutionary strategy is also likely to have been adopted by primitive cells, which would need to acquire new functions to replicate efficiently in different environments. The new function or functions acquired by an ancient/primitive cell could sometimes cause a conflict with pre-existing functions. To resolve this conflict, a mutant would evolve such that it would be able to coordinate new functions with pre-existing ones. This conflict resolution process is the same evolutionary strategy that we aim to emulate. Therefore, the construction and improvement of model self-replication systems by evolutionary strategies will provide a deeper understanding of the origin of coordinated biological systems.
Editors: David Deamer and Jack W. Szostak
Additional Perspectives on The Origins of Life available at www.cshperspectives.org