HPdV carried out all analyses and simulations and wrote the text.
Title: Bioenergetic origin of the genetic code
Version: 2 Date: 27 June 2011
Reviewer: William Martin
Reviewer number: 1
Review of Vladar for BD
The author suggests that there is a connection between the structure and evolution of the genetic code and the Stickland reaction, a fermentation reaction in anaerobes like Clostridia and relatives.
I would like to start by clarifying that I do not claim that the extant pathways of the Stickland reaction are directly derived from the putative ancient pathways for amino acid fermentation, as discussed in the main text. But the existence of these pathways in extant organisms makes amino acid fermentation plausible.
"The proposition is that the origin of the genetic code traces back to the origin of metabolism"; that is a nice thought, but Copley et al. (2005) developed it previously, albeit from a very different angle.
The mechanisms presented by Copley et al. (2005) [2] predict that several of the amino acids can be synthesised from their corresponding keto acids in dinucleotide complexes. The dinucleotides match the first two bases of the codons of the amino acids in the code; this provides a possible mechanism for the coevolution theory of the code [62], and for the code within the codon [61]. Personally, I regard the biosynthesis of amino acids as a later stage in the evolution of the code. This is particularly appealing in light of the ideas of Copley et al. (2005) [2]; the amino acids, synthesised in dinucleotides corresponding to codons, would be transferred to adapters in a synthetase-like reaction. This would necessarily force codon swapping and reorganisation of an existing proto-code. But at this point the function of the amino acids would have departed from bioenergetics, otherwise we would have a circular argument and a perpetuum mobile.
On the topic of amino acids and bioenergetics, Amend & Shock (1998) show that the synthesis of amino acids (and proteins) from H2 and CO2 and ammonium is thermodynamically favourable under hydrothermal vent conditions. Amend and McCollom (2009) show that the synthesis of cell material from H2 and CO2 and ammonium is thermodynamically favourable under specifically alkaline hydrothermal vent conditions, whereby the synthesis of amino acids (not their breakdown via the Stickland reaction) delivers the strongest contribution to the overall exergonic reaction.
The conditions in which amino acid fermentation is exergonic happen to be the same as those maintaining the stability of an 11-nucleotide proto-adapter. However, under the conditions of the hydrothermal vents, where amino acid fermentation is not favourable, the proto-adapters need to be much larger to allow for thermal stability of their structure. In the text I suggest that the initial steps for the establishment of the code could have occurred in cooler periods (quiet or intermittent episodes of vent flux), or away from the vent locations. This would favour both amino acid breakdown and proto-adapter stability.
More generally, the notion that these amino acid pairs might somehow interact in such a way as to ultimately lead to the synthesis of acyl phosphates when they are connected to tRNA (see Figure ) is not going to work at all, because the carboxyl group is esterified in RNA-bound amino acids and that ester bond formation requires ATP hydrolysis at least in modern metabolism ("activation" of the amino acid) so there is no room for net energy conservation.
The chemical details of the RNA double strands to which amino acids are attached are only one of several possible interpretations, but one for which we can rationalise certain details. There is no evidence that stereochemistry can explain the association between the simpler amino acids and RNAs. Thus I appeal to chemical bonds catalysed by ribozymes (synthetases), which do have sequence specificity. This structural model is perhaps the most naïve interpretation of the correlation between the Stickland complementation and the anticodon complementation. Accurate structural analyses, both computational and experimental, could establish in more detail the nature of the interactions between amino acids and RNAs. But in any case, energy has to be invested in order to establish a covalent bond. In fact, as you point out, the synthetases, in both versions (extant proteins and ribozymes such as the flexizyme), need amino acids that are activated with AMP in order to be transferred to the adapters. In an iron-sulfur world, the activation could be achieved by thioesters, but this does not avoid the energetic investment.
What is the nature of the chemical bond between the AAs and the RNA in Figure , a 2' ester?
This is an appealing possibility, considering that the synthetases (natural proteins and selected ribozymes) form this type of bond. First of all, position 3' in my molecular model (Figure ) is occupied by the phosphoester of the backbone of the RNA molecule. Thus the only free reactive group is the -OH on the 2' carbon. The activated amino acid can form this bond because the oxygen in position 2' makes a nucleophilic attack on the beta-carbon of the activated amino acid (the one linked to the phosphoester). Without the AMP, the nucleophilic attack would not proceed. Furthermore, the synthetases of class II (ancestral mode) amino-acylate the 3' carbon, not the 2' as in my mechanism. On the other hand, some flexizymes charge the adapters at the 2' carbon when targeting a non-terminal nucleotide. Therefore it is not inconceivable that synthetases originally acted in this way.
The concept as suggested in the title is problematic because very large amounts of amino acids would have to run through this system as an energy source if the Stickland reaction is really going to serve as a bioenergetic motor. Having simpler compounds like H2 and CO2 as the energy source at the origin of the genetic code, with amino acids doing things like catalysis and making proteins, might seem more reasonable.
In fact, I agree that simpler compounds could be used as an energy source. My point does not need to be regarded as contradicting this. Instead, what I claim is that using amino acids as an energy source is plausible on energetic grounds, and that this links to the early stages of assignment to proto-adapters. This does not imply that amino acids are the main energy source, or that they need to replace other sources, etc. The central point, as I explain in more detail in the new version of the manuscript, is that prior to the usage of amino acids as catalysts, the assignment of the simpler amino acids (most of which are poor catalysts) to complementary proto-adapters can account for the earliest steps of coding.
The chemical connection -- at the level of structure that we can draw -- between amino acid pairs and short RNA intermediates en route to the code is much much weaker (if at all existent) than the logical connection that can be construed as in Figure or Figure .
We are dealing with events that occurred a long time ago, and only rarely do we have data from which to draw hypotheses about their origin and evolution. It is even more unusual to be able to test these ideas. My hypothesis is: (a) logically sound, (b) supported by (preliminary) evidence, (c) chemically plausible (in the sense that amino acid fermentation can sustain metabolism). The evidence is weak, but sufficient to draw some preliminary conclusions which merit further research. I accept your criticism based on chemical grounds that this structural model is unlikely, and thus a "weak connection". It does not help that we have no understanding of the ribozymatic machinery that was available at that time. In any case, these structural details are neither the centre of the hypothesis, nor ultimately relevant for evolution. Of course, selection has to act on a material basis, which might impose important constraints. Therefore, if the hypothesis presented here can explain some early steps of the code, the "weak chemical connections" will reveal the key aspects that we ignore about the ribozyme metabolism.
Title: Bioenergetic origin of the genetic code
Version: 2 Date: 4 August 2011
Reviewer: Eörs Szathmáry
Reviewer number: 2
This is an extremely original idea that is a pleasure to read. For quite some time I have been wondering what the initial advantage of amino acid usage could have been. Every such advantage is suggestive of a possible preadaptation (exaptation) that has the potential to render further evolution easier. Let me discuss some issues in steps. First, note that in an RNA world some amino acids may have been present already because they played a role in nucleotide and coenzyme biosynthesis (Gly, Ala, Val and Asp are the prime suspects).
These "Miller" amino acids are the simplest to synthesise by several means. Furthermore, the phylogenetic signal in ribosomes strongly suggests that these (amongst a small set of others) were preferentially used during the early days of the translational machinery. The implication is that (be it by historical contingencies, or by selective fixation) these amino acids were fundamental to the RNA world metabolism.
As Koonin suggested, selective retention of such amino acids by nucleic acid moieties to prevent them from passing through an early, leaky membrane was potentially selectively advantageous.
Selective retention is an appealing mechanism, in particular when we consider that, as far as we know, there are no ribozymes associated with permeability and transport. Thus alternative processes like the one you mention had to exist in order to deal with the loss by diffusion through a membrane. This might have been a crucial step, in that the association between amino acids and oligonucleotides was not necessarily specific. This would have created an initial diversity within which further selective processes (e.g. fermentation, or any other) could act.
The catalytic role, as suggested by the CCH hypothesis, would have to come later. I see the bioenergetic hypothesis as an attempt to build an even stronger bridge between very modest usage of amino acids and their usage as catalysts.
Certainly catalysis is the central function of amino acids, and as you and Kun have previously shown (2007), aspartic acid is very reactive. The problem, as discussed in the text, is that stereochemistry has not been shown to be a determinant of the coding triplets for the simpler amino acids. The bioenergetic role of the amino acids accounts for some assignment patterns after selection has acted.
De Vladar proposes a further pattern for the vocabulary extension of amino acids, in that he points out that in several cases amino acids assigned to complementary anticodons are Stickland pairs. Well spotted! Incidentally, it is also true that they tend to be complementary in the catalytic-structural role, as the Rodins and I noticed before. The bioenergetics idea is so nice that I hope that there is something in it, but careful further thinking is badly needed (not necessarily in this pioneering article).
I regard complementarity of amino acid roles as imposing strong constraints on the establishment of the code. Whether explicable through the Stickland reaction or through a catalysis/structural function, it is still a question of developing detailed arguments and gathering further evidence. For example, we could regard the whole complex of two proto-adapters with amino acids attached to them as a more complicated version of the CCH, which allows a combinatorial range of ribozymatic functions even considering only the simpler amino acids. The appeal of the subject is that we can test these kinds of hypotheses!
A brief technical note. Although originally the coding coenzyme handles were proposed to be anticodon triplets, the modern version proposes that the advent of CCH arrived with short loops. There are two arguments to support this. First, for specific recognition through Watson-Crick pairing, and for ample residence time on the ribozyme to be catalytically complemented, the handles must have a fairly defined conformation, which is ensured by loops but not by free triplets. Second, the tRNA evolution considerations with the Rodins also point to the appearance of anticodon and catalytic amino acids at the short loop stage.
These two arguments also apply to the Stickland hypothesis. Nucleotide triplets have a melting temperature that is too low for dimers to form in solution. However, proto-adapters of a dozen nucleotides can be stable at physiological (although not at hydrothermal-vent) temperature. Thus it is most likely that the three-dimensional conformation of the RNAs plays a crucial role in the function, and consequently in the eventual establishment of the code.
My first worry is the whole context of the bioenergetic role. As the author cites, there are chemical transformations (oxidation and reduction) that happen here. Does he think that these are spontaneous, once aligned by the complementary handles, or is further catalysis needed? If yes, why not simply assume ribozymes to bind the two reactants?
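The length dependence invoked here can be illustrated with a rough sketch. The snippet below uses the Wallace rule of thumb (2 °C per A/T and 4 °C per G/C), which was formulated for short DNA oligonucleotides; applying it to RNA, with U counted like T, is an assumption made purely for illustration and is not the method used in the manuscript. The sequences are hypothetical, and the numbers should be read only as showing the trend with length, not as estimates of real melting temperatures.

```python
def wallace_tm_celsius(seq: str) -> int:
    """Crude melting temperature estimate from base composition (Wallace rule).

    Assumption: U is treated like T, although the rule was derived for DNA.
    """
    seq = seq.upper()
    weak = seq.count("A") + seq.count("U") + seq.count("T")
    strong = seq.count("G") + seq.count("C")
    return 2 * weak + 4 * strong


# Hypothetical sequences, chosen only to contrast a triplet with a 12-mer.
triplet = "GCA"
proto_adapter = "GCAUGCAUGCAU"

for name, seq in [("triplet", triplet), ("12-mer", proto_adapter)]:
    print(f"{name} ({len(seq)} nt): ~{wallace_tm_celsius(seq)} C")
```

A nearest-neighbour thermodynamic model with RNA parameters would be needed for anything beyond this qualitative comparison.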
Initially, I had the hope that the structural constraints on the handles would facilitate proton and electron exchange. But a closer look revealed that the reactions are less prone to happen when covalently bonded. On the one hand this allows, as you say, ribozymes to act on a very specific moiety, but on the other hand, it suggests the need to employ strong reductant cofactors. So I presume that both cofactors (such as NADH or equivalent) and ribozymes are needed. This is my assumption.
Furthermore, what happens to the transformed reactants afterwards? If they remain linked to the same handles, then each handle would ultimately be linked to a diverse set of different intermediates of metabolism.
This is a question that depends in a very specific manner on the actual mechanism. As I imagine it now, after the deamination the remaining moiety (the keto acid) is eliminated by removing it from its carbonyl group and synthesising, for example, acyl phosphates. The problem here is that this requires a reductase activity (which certainly is not spontaneous) and needs a strong reductant cofactor (extant Clostridia employ seleno-protein complexes). Although problematic, this is a critical step, since the deamination itself, although transferring electrons from one amino acid to another, does not release free energy.
This leads to my second worry. How does assignment (coding) arise? What is its significance? The bioenergetic role by itself would not call for coding. Does de Vladar think that assignment just happens, through stereochemistry, and gets frozen in the system for a while, without any functional role? Note that for example in the CCH hypothesis coding naturally arises through the necessity to bind the right amino acids through their handles, to catalyse the right reactions by the complemented ribozymes.
The bioenergetic role imposes constraints on the pair of adapters, but it is true that it does not itself prescribe any specific triplet to any amino acid. However, if one triplet is set, the choice of the amino acids that can be assigned to the complementary anticodon is cut by half. For example, if there were only four amino acids in question at the initial stages (Ala, Gly, Asp and Val), then the chance of assigning a correct (Stickland complement) amino acid is one in two. However, the complements of the complement via U/G pairs in the second position would constrain the amino acid which could be assigned to that codon (the one having the same Stickland role as the original amino acid). Thus the degrees of freedom are constrained. However, as shown in the text, initial random assignments of a few amino acids (2-4) could be enough to harvest limited energy. Then selecting on this energy yield results in more specific patterns. However, at some point we will need to invoke the synthetases. These are responsible for reading the sequence of the proto-adapters and charging them accordingly. In the simulations, all assignments were equally likely, and neutral. As has been shown by you and the Rodins [6], this is not the case; the synthetases arose from the need to make specific (or pseudo-specific) assignments. Some insights come from the crystal structure of the flexizyme, showing that the amino acylation occurs by stacking the phenylalanine ring with a guanidine ring; the former is stabilised by the oxygen of the latter. This orients the carboxyl with the 3' carbon of the terminal adenine, so that the bond can be established. Therefore stereochemistry (although somewhat different from the stereochemical theory) shows that the assignments can be substantially biased.
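The "one in two" figure can be checked with a minimal enumeration. The sketch below assumes that the four early amino acids split evenly into two Stickland roles and that both members of a complementary adapter pair are charged independently and uniformly at random, as in the neutral case described above. The labels `aa1` to `aa4` and their role assignments are hypothetical placeholders, not the roles given in the manuscript.

```python
from itertools import product

# Hypothetical placeholders: two donors and two acceptors (an assumption).
ROLES = {"aa1": "donor", "aa2": "donor", "aa3": "acceptor", "aa4": "acceptor"}


def complement_fraction(roles: dict) -> float:
    """Fraction of ordered amino acid pairs with opposite Stickland roles.

    Each pair stands for the two amino acids charged onto a pair of
    complementary proto-adapters under neutral, uniform assignment.
    """
    pairs = list(product(roles, repeat=2))
    matching = [(a, b) for a, b in pairs if roles[a] != roles[b]]
    return len(matching) / len(pairs)


print(complement_fraction(ROLES))
```

With an uneven split of roles the fraction would fall below one half, so the result depends directly on the assumed role assignment.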
Third, the nature of the reactions with the amino acids bound. Assume that Figures and actually do present pretty well what is imagined to have happened. There are a few difficulties. As I noted above, mere triplets may not be sufficient for such an interaction because of their weak binding to each other and also to the amino acids (selected Yarus aptamers are always bigger).
I do not assume that these are triplets; Figures and show only the triplets in order to sketch the cycles that can be formed due to the complementarity of the proto-anticodons! I have made this more explicit in the new version.
Furthermore, I am worried whether the two amino acids at opposite ends of the complementary strands would be free to interact or not. Perhaps not!
This is an excellent observation. In short, if the amino acids are attached at the end of the chains they are not close enough to interact (Figure ). Thus I had to figure out at which positions this could happen, shown in Figure . Notably, the ribo-synthetases can amino acylate at arbitrary positions of the RNA, not only at the end of it.
Setting this aside, what is the chemical nature of binding of the amino acids to the handles? I suppose De Vladar thinks (with me also) that the link was covalent at this stage. But how? Something very important may be lurking here. For the CCH hypothesis I was prompted to assume that the initial coupling must have been a stable N-link as seen in some contemporary amino acid-based modification to the anticodon loop. Later I realised that I had simply forgotten that Woese suggested the same in the late sixties already. In a letter to me he wrote that "now I would be worried about the energetics of this reaction". Yes, yes, but here may just be a crucial link! Let us think about it in the future. I hope readers do not mind that I am thinking in writing here.
The advantage of assuming an N-link is that the reactions could be related to the synthesis of nucleotides. The problem is that if linked through the nitrogen, the reactions that can happen afterwards do not change the oxidation state of the molecule; but this is debatable since it would all depend on the specific cofactors and ribozymes. But a priori, the odds favour a more labile ester link, a conformation in which the oxidation-reduction reactions can happen.
Title: Amino acid fermentation at the origin of the genetic code
Versions: 2 & 3 Date: 22 August 2011/12 December 2011
Reviewer: Ádam Kun
Reviewer number: 3
Harold P. de Vladar presents an intriguing hypothesis about the origin of the genetic code. He finds a correlation between the code and reactive pairs of amino acids that could be used to fuel a metabolism, as seen in extant bacteria in the genus Clostridium.
I fully agree with the statement that "we cannot possibly know what actually happened", thus there is a need to come up with plausible and testable hypotheses about the origin of life and its stages. However, I think that amino acids were first and foremost a catalytic aid, and not a source of energy.
I am in total accordance with the idea that when the translation machinery was about to be established, the amino acids had foremost a catalytic (and structural) role. Critically, it is precisely this function that must have triggered the evolution of the translation machinery. Most likely, but still debatably, this function had to be implemented even before such machinery existed. Needless to say, this does not impose any constraint on the history of the role of amino acids in an ancient metabolism. Although amino acids are relatively simple, they are reactive, have a high oxidation state, and are easy to synthesise. It is therefore to be expected that they serve(d) many purposes. The ideas that I have presented, as discussed, are not inconsistent with the usage of amino acids as catalysts. In the metabolism of extant organisms, amino acids are not used only for catalysis or structural functions in the proteins; they have a variety of functions. Any combination of these (or other) functions might also have been present in earlier stages of evolution. In my opinion, we are lucky that we can find ways to rationalise any of these functions. The ideas presented here might be wrong, but at this stage, the relevant issue is that we have enough information to state a precise hypothesis and devise ways to test it.
I base my assessment on the following lines of argument: (1) Historical contingencies: We find "fossils" of the past in our current metabolism. The RNA-centric translation with a ribozyme doing the peptidyl transfer is a fossil from the RNA world. Our coenzymes, which all harbour a nucleotide part, even though it is not the nucleotide part that does the job, are again fossils from the RNA world. The chemical nature of the bases is a contingency; there are many possible alternatives, and some might even be better than these, but once evolution found this solution, it is very difficult if not impossible to change them. If amino acids had had such an important, central role in energy metabolism, it would show in our current metabolism. The fact that there is an example of this in one genus of bacteria, but nowhere else in the living world, does not help. If Clostridium were the most ancient bacteria, so that this mode of energy metabolism were preserved there but not in the other lineages, then it would be a valid argument. However, this is not the case (correct me if I'm wrong).
I agree that the idea is puzzling, and that the evidence is not as strong as it might be. Yet, how much about the bacterial biota do we know? As you say, "once evolution found this solution, it is very difficult if not impossible to change them". But is it not exactly because of this tendency that you point out that it is in the differences that we often find crucial clues? After all, if glucose fermentation was the optimal solution, why are there other types of fermentation? Almost literally, every new bacterial species (in the broad sense) whose metabolism is surveyed unveils a new dimension of pathways. Until recently, there was a strong bias towards detecting microorganisms that use carbohydrates as an energy source. The odds are in favour of detecting alternate pathways when we discover new species. Thus I appeal to the factual cliché that absence of evidence is not evidence of absence. But at the same time, I would not wish to bury my arguments in obscurity until these are explored. I accept the criticism exposed above, particularly as a further opportunity to add or remove support for the hypothesis. Doubtlessly, once the diversity of metabolism is less biased and we have clearer ideas about the vast space of possibilities, we will be able not only to evaluate this hypothesis more accurately, but to state many others as well.
As a more direct answer to your concerns, although it is true that most known microorganisms do not perform the Stickland reaction, the proteins employed for this pathway are widespread. In particular, the reductases and the dehydrogenases perform the critical steps in the amino acid fermentation. These enzymes are not exclusive to the Stickland pathway, although in Clostridia they seem to be specialised for that function. The fact that these reactions happen allows for the possibility that analogous mechanisms existed, a possibility that was proposed at two points in the text: first, when it was suggested that the prebiotic mechanisms could be analogous to those performed by the enzymes above; second, when it was proposed that RNAs can be artificially evolved to perform such functions, for which we would need to provide some cofactors (electron carriers, most likely NADH), as we learned from the biochemistry of the amino acid fermenters. Naturally, a detailed analysis of the molecular mechanisms may reveal molecular fossils pointing to factual and specific evidence.
Furthermore, in an RNA world setting we can safely assume that there are ample sugars around (if nothing else, the nucleotides can hydrolyse to give ribose), which are much better energy sources than amino acids (see 3rd paragraph of the "The Stickland reaction" section).
If sugars were widely available, they could have been a major energy source. If they were limiting, then a "division of labour" would be convenient, with sugars employed to synthesise nucleotides and amino acid fermentation to fuel metabolism. Incidentally, nucleosides might just as easily be synthesised from glycine and aspartic acid [15]. Therefore both sugars and amino acids are needed for the synthesis of RNAs. In any case, the view of a "main energy source" might be biased, and inapplicable in a prebiotic scenario, because most compounds were scarce (which is what most prebiotic models suggest, particularly away from the hydrothermal vents). Thus harvesting energy from multiple sources would be a convenient bet-hedging strategy. The existence of various carbon sources does not contradict the amino acid fermentation arguments for the origin of the code.
(2) Prebiotic synthesis of amino acids: Most of the amino acids in the genetic code have a rather low yield in the Miller experiment. I agree that the possibility of their formation is the most important outcome of the experiment, and there can be other, prebiotically plausible reaction pathways that produce amino acids in much higher yield. But as de Vladar states, "But was the relatively low abiotic yield of amino acids enough to sustain protobionts based on RNA metabolisms? The answer to that question strongly depends on the role of amino acids at the moment when the genetic code was established." (last line in the section "Abiogenesis of amino acids"). Indeed, if amino acids are used to fuel metabolism, then they are consumed in the process, thus requiring a much higher yield than using them as cofactors, in which case the amino acids are not consumed.
Admittedly, the question of the yield of amino acids still stands. This is a question about geochemistry, not about evolution; it is nevertheless relevant. As I see it, what is most important about Miller-Urey synthesis is that it shows the ease with which different amino acids are produced. The yield in this particular experiment is somewhat moot, because the conditions under which amino acids were formed are largely unknown. What is clear is that amino acids are conspicuous products in organic chemistry, and that the simplest ones (glycine, alanine, etc.) are the most common. Recall that under a range of conditions [12] and energy sources [11], as well as from cosmogenic synthesis [65], these and many other amino acids are formed. Moreover, Bada [66] calculated that the amount of amino acids in non-biotic reservoirs is larger than in the biosphere. Thus it is specious to say that abiotic amino acid sources were not available.
However, these patterns cannot be explained by catalysis alone, because besides aspartic acid, the other simple amino acids are bad catalysts. Hence it is doubtful that a specialised ribozymatic machinery (synthetases) would have evolved in order to charge RNAs with the simple, non-catalyst amino acids; there would simply be no selective advantage for such a system, and instead only costs.
Thus while the first part of the statement "The synthesis of most catalytically important amino acids is very elaborate, and their abiotic yield is negligible (except for aspartic acid)," is true, it does not imply the second part: "a fact that necessarily postpones the catalytic functions of the amino acids to later historical stages."
Indeed, aspartic acid could have functioned as a catalyst from a very early age. My statement referred to the full repertoire of amino acids; in order to function as a robust battery of catalysts, they would have to do so (and continue to do so) only at later stages, once their biosynthesis had evolved (this is independent of whether there was a fermentation role or not for the simpler amino acids at earlier stages). The central argument is that Stickland-type reactions can explain some patterns in the code. A collateral advantage could come from catalysis, setting up the pre-adaptations for a catalytic role (as in the CCH).
(3) Development of the RNA world: I have reservations about the historical context of the presented hypothesis. The current view of the RNA world puts the evolution of the genetic code at the end of the era.
I completely agree. The ideas that I present in the article intend to explain the first steps, the pre-adaptations, that eventually led to a code. The rearrangements (fine-tuning, if you wish) of the code to optimise catalysis, protein folds, etc. would come towards the end of the RNA era, as you point out. But the earlier steps might have occurred long before that, say, after the iron-sulfur era, at the hydrothermal vents.
The RNA world already possessed cellular organisation and a rich metabolism run by many ribozymes before the advent of translation. Thus an energy producing system was already in place. In view of our current metabolism, and what was surely available to the RNA world, sugars and simpler organic compounds were the main sources of energy.
I restate: using amino acids as energy sources is not incompatible with their use as catalytic factors and thus also not with the evolution of translation. But there is little doubt that saccharides were the main source of energy at that point, in part because they are essential for (a) nucleic acid metabolism, and (b) membrane regulation in the absence of proteins, and it is therefore hard to exclude saccharides at that stage. But good as they are, the saccharides (particularly the hexoses) are too reactive, and need to be under strict control due to the risk of glycation of both lipids and amino acids (although reactions with the latter might have had a role in RNA synthesis). Thus if we focus on earlier stages, less risky, but equally efficient carbon sources would be amino acids, other small organic molecules, and of course CO2.
An energy metabolism based on pairs of amino acids attached to specific adapters would require 10 (limiting the amino acid set to the 10 primordially available in Table ) highly specific enzymes to ligate the amino acids to the adapter. Such a highly specific system could only arise at a stage with an established metabolism.
I avoided going into this subject in the article on purpose, since it is complex, but it is a very good observation. The enzymes that attach the amino acids to the tRNAs, namely the synthetases, are highly specific to both of their substrates: the amino acid and its cognate tRNA. Their evolution is in itself puzzling [67], but this specificity is what is thought to have shaped the code [6]. But ribozymes have been artificially evolved to perform the amino acylation reaction [54], which is a rather encouraging piece of evidence. To summarize, the picture is that the evolution of the synthetases was boosted by the emergence of new modes of amino acylation, which also allowed the inclusion of new amino acids in the code. But this had to happen with substantial variability in the proto-adapter sequences. It had to be a very specific metabolism, but probably ribozymes and cofactors were enough for these early steps, as suggested by in vitro evolution of ribo-synthetases [54].
However, such an established metabolism does not generally need new energy sources.
I think that from the perspective of the ribozymes, the energy source does not matter. In fact, if the energy metabolism is canalised into a single currency (ATP), the rest of the metabolic system is blind to it, regardless of what the source is, without needing major rearrangements when new energetic sources are implemented.
Is the Stickland reaction that much more efficient than other modes of anaerobic fermentation?
The oxidation states of the amino acids are similar to those of carbohydrates, and if efficiently reduced, they can synthesise a similar amount of ATP. For example, the fermentation of glucose leads to 102 kJ/mol ATP, lactate to 71 kJ/mol ATP, and the fermentation of two glycine molecules results in 72 kJ/mol ATP.
If amino acids were so abundant that they would be a convenient source of energy, then why are amino acids that are very abundant in the Miller experiment not represented in the genetic code (e.g. sarcosine, N-methylalanine, etc., Miller and Urey 1959 Science 130:245)?
This concern is a central question, and perhaps the catalytic idea, along with the mechanisms for modifications to the code, can be a better avenue to explain the amino acid repertoire. But it is important to consider that the Stickland reaction also happens with amino acids that are not the proteinogenic ones (and it can happen even with purines and pyrimidines). My guess is that these amino acids were either replaced by more complex ones, or simply dispensed with. But the question becomes more complex when we consider the coevolution of the adapters with the amino acid repertoire. Again, we find the chicken-and-egg problem: did the adapters evolve as a response to a wider choice of amino acids, or was it the variability during the evolution of the adapters that allowed more amino acids to be "invented" and included in a proto-code? I would rather leave these questions for the future, but we should not forget them!
The simulated random codon assignments demonstrate that the rarer amino acids are better Stickland pairs, hence the increasing mean ATP yield. If so, why did Gly and Ala remain in the code? This is only plausible if we assume that the protocells still relied on external amino acid sources, so that subpar but abundant pairs needed to be maintained as well. This could only be true in the early days of the RNA world; however, we have seen that such a metabolic system by necessity appeared late (the author also suggests that the code appeared late, see the 1st paragraph of the "Relation to the abiotic and early metabolism" section). I see a contradiction here.
There is no contradiction: at the earliest times, say at the hydrothermal vents, the autotrophic metabolism would suffice. Away from the vents (or in latent periods of activity), cells would rely on external sources of energy (including amino acids). But indeed, the code (as such, for translation) was established at the time of LUCA. At this point amino acid biosynthesis and the citric acid cycle would have already evolved. The period in between is where external sources of amino acids would be required. As you suggest, more complex amino acids were, to some extent, selectively advantageous (because of both catalysis and fermentation). My guess is that alanine and glycine remained for two reasons. First, most amino acids are electron donors and only a few are acceptors, which sets a pressure to maintain glycine. Furthermore, given the efficiency and antiquity of alanine's reaction with glycine, the pair would be too costly to simply dispose of (recall that the assumption is that there were ribozymes catalysing such a reaction). Second, they are the simplest and most abundant, so quantity balances quality. As a rule of thumb, if we consider the ratio of the yield of electron donors vs. electron acceptors in the Miller experiment, we find that it is 831.4 : 482.51 μmol, respectively. If we consider only that of alanine and glycine, it is 790 : 440 μmol. In other words, they constitute more than 90% of the total substrate for the Stickland reaction. I must point out that I did not take these proportions into account in the simulations; it remains to be studied how much the distribution of the amino acids in solution affects the yield of ATP. It is indeed a pertinent and relevant question.
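The "more than 90%" figure above can be checked with a few lines of arithmetic. The sketch below uses only the four yields quoted in the response; the variable and function names are illustrative, and it assumes (following the order in which the ratios are given) that alanine is counted among the donors and glycine among the acceptors.

```python
# Miller-experiment yields quoted in the text, in micromoles.
DONORS_TOTAL = 831.4      # all electron donors
ACCEPTORS_TOTAL = 482.51  # all electron acceptors
ALANINE = 790.0           # assumed to be the donor term of the 790 : 440 ratio
GLYCINE = 440.0           # assumed to be the acceptor term of the 790 : 440 ratio


def substrate_share(donor, acceptor, all_donors, all_acceptors):
    """Fraction of the total Stickland substrate contributed by one donor/acceptor pair."""
    return (donor + acceptor) / (all_donors + all_acceptors)


share = substrate_share(ALANINE, GLYCINE, DONORS_TOTAL, ACCEPTORS_TOTAL)
ratio_all = DONORS_TOTAL / ACCEPTORS_TOTAL  # donor : acceptor ratio, all amino acids
ratio_pair = ALANINE / GLYCINE              # donor : acceptor ratio, Ala and Gly only

print(f"Ala + Gly share of substrate: {share:.1%}")
print(f"donor/acceptor ratio, all: {ratio_all:.2f}; Ala/Gly only: {ratio_pair:.2f}")
```

With these inputs the share comes out at roughly 94%, consistent with the statement that alanine and glycine make up more than 90% of the substrate.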
The author could have assessed how much better the current genetic code is, compared to random codes, at being in accordance with the hypothesis. I know that there are novel amino acids, and also that the code as we know it has evolved to resist mutations and not to maintain Stickland pairs, but still it might be worth doing.
Consider first the ATP yield of random codes that use only alanine, glycine, aspartic acid and valine. In Higgs' four column theory the ancestral code assigns them to the codons NCN, NGN, NAN, and NUN, respectively. This code gives a yield of 0.073 mol of ATP. Compare this with a bootstrap using the same four amino acids, randomising the assignments: the mean yield of ATP is 1.06 ± 0.017 mol of ATP; the four column - four amino acid code gives a typical yield, as compared with the ensemble (p > 0.9). Since there is only one determiner base, in the second position, it is an efficient code given its complexity. A Higgs code that includes glutamic acid, assigning it to codons NAR, does not change its fermentation yield (because there are no new adapter pairs that bear Stickland reagents), whereas the mean yield of the ensemble is increased to 0.2 ± 0.028 mol of ATP. Naturally, the order of addition makes a difference. For instance, adding leucine before glutamic acid in the four column code increases the yield ten-fold, to 0.104 mol of ATP (the ensemble is shifted to 0.2 ± 0.29 mol of ATP). Still, all these codes lie in the lower tails of the bootstrapped distributions, even though they are improved in every expansion. One interpretation is that expansions to the code were far more frequent than code rearrangements. The standard genetic code would yield 1.04 mol of ATP, a very typical value compared with the randomised ensemble, 1.01 ± 0.358 mol of ATP. The question is up to what point it remains worthwhile to add amino acids under the amino acid fermentation hypothesis. Clearly at some point this all breaks down. The fitness gained by adding an amino acid to the code necessarily involves costs, for it needs new ribozymes to ferment it, and scarce amino acids would require new and highly specific synthetases to attach them to the adapters.
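The comparison of a given code against an ensemble of randomised assignments can be sketched as follows. This is a minimal illustration of the procedure only. The score used here (the fraction of codons whose reverse-complement partner carries the opposite Stickland role) is a hypothetical stand-in, not the ATP yield calculation of the manuscript, so its values are not comparable with the yields in mol of ATP quoted above. The donor and acceptor roles, and the treatment of aspartic acid as neither, are likewise assumptions made for the example.

```python
import itertools
import random

BASES = "ACGU"
COMPLEMENT = {"A": "U", "U": "A", "C": "G", "G": "C"}
# Assumed Stickland roles for this toy example (Asp treated as neither).
ROLE = {"Ala": "donor", "Val": "donor", "Gly": "acceptor", "Asp": None}
# Four-column code from the text, keyed by the second codon base:
# NCN -> Ala, NGN -> Gly, NAN -> Asp, NUN -> Val.
COLUMN = {"C": "Ala", "G": "Gly", "A": "Asp", "U": "Val"}
CODONS = ["".join(c) for c in itertools.product(BASES, repeat=3)]


def four_column_code():
    """Map every codon to the amino acid of its second-position column."""
    return {codon: COLUMN[codon[1]] for codon in CODONS}


def reverse_complement(codon):
    return "".join(COMPLEMENT[b] for b in reversed(codon))


def toy_score(code):
    """Hypothetical score: fraction of codons paired with a partner of opposite role."""
    hits = 0
    for codon, aa in code.items():
        partner = code[reverse_complement(codon)]
        if {ROLE[aa], ROLE[partner]} == {"donor", "acceptor"}:
            hits += 1
    return hits / len(code)


def randomised_scores(code, n, seed=0):
    """Scores of n codes obtained by shuffling the amino acid labels across codons."""
    rng = random.Random(seed)
    codons = list(code)
    labels = list(code.values())
    scores = []
    for _ in range(n):
        rng.shuffle(labels)
        scores.append(toy_score(dict(zip(codons, labels))))
    return scores


observed = toy_score(four_column_code())
ensemble = randomised_scores(four_column_code(), n=1000)
# Empirical fraction of randomised codes scoring at least as high as the observed code.
frac_at_least = sum(s >= observed for s in ensemble) / len(ensemble)
print(f"observed toy score: {observed}")
print(f"fraction of ensemble >= observed: {frac_at_least}")
```

Under this toy score the four column code scores 0.5, because every NCN codon pairs with an NGN codon. A realistic comparison would replace `toy_score` with the actual yield model and would randomise the assignments in the way the simulations of the main text do.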