All known life shares an ancestor at the root of the universal phylogenetic tree (1). This Last Universal Common Ancestor (LUCA) is a construct that may represent a single organism (1) or may represent populations of organisms capable of sharing large amounts of genetic information through horizontal gene transfer (3). Either way, organisms at the time of LUCA possessed many of the fundamental features present in modern organisms and likely exhibited a level of sophistication comparable with modern Bacteria or Archaea (5). Over the last decade, a number of studies have used bioinformatics databases to characterize the minimal set of features present in LUCA. The subjects of these surveys include gene families, protein architectures, protein domains and motifs and enzymatic functions.
Most of these studies identify a minimal set of hundreds of traits present in LUCA, which also most likely had a DNA genome, a cell membrane and a complete translation system. This complexity implies that a significant amount of evolutionary change must have taken place between the first life forms and LUCA. Multiple lines of evidence suggest that the earliest genetically encoded metabolism was produced by an RNA-only system in which RNA genes encoded ribozyme catalysts (6). Still more evidence suggests that protein translation arose from this RNA-only system (7) and that the DNA genome subsequently arose from the RNA-protein system, possibly just prior to the divergence of LUCA into the three domains of life (9). The capacity of an RNA-only system to support life has been studied by surveying the roles of naturally occurring ribozymes and synthesizing new ribozymes in vitro that have functions relevant to early life (11).
Non-genetically encoded catalysts, such as metal ions (12) or mineral surfaces (13), may also have played an important role in the production of large biomolecules both before and during the RNA-only era. Modern enzymes often use both organic and inorganic cofactors to impart catalysis. Some inorganic cofactors might reflect a pre-protein stage in which the reactions were catalysed by analogous ions and minerals (15). Similarly, nucleotide-derived cofactors may reflect a ribozyme precursor to modern protein enzymes that catalysed an analogous reaction.
Here we describe LUCApedia, which integrates these three lines of research into a unified framework provided by several well-established repositories of protein data. Users may query the database web server for a single protein in order to collect evidence of its antiquity from a broad range of studies. Downloadable database files may be used to evaluate the earliest components of modern pathways and to compare the antiquity of similar pathways to one another in an automated fashion. Users may also test the accuracy of previous studies and hypotheses implemented in the database by corroborating one of its data sets against the rest.
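As an illustration of how one data set might be corroborated against the rest, the following minimal Python sketch counts, for each protein, the data sets that support its antiquity and reports the fraction of one data set's proteins that are backed by at least one other. The tab-delimited layout (protein identifier, then data set name), the function names `load_annotations` and `corroborate`, and the identifiers in the sample rows are assumptions made for this example; they do not describe the actual format of the downloadable database files.

```python
from collections import defaultdict


def load_annotations(lines):
    """Parse rows of 'protein_id<TAB>dataset' (hypothetical layout).

    Returns a dict mapping each protein ID to the set of data sets
    that list it. Blank lines and lines starting with '#' are skipped.
    """
    support = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        protein_id, dataset = line.split("\t")[:2]
        support[protein_id].add(dataset)
    return support


def corroborate(support, target):
    """Fraction of proteins in `target` that appear in at least one other data set."""
    in_target = [p for p, datasets in support.items() if target in datasets]
    if not in_target:
        return 0.0
    backed = sum(1 for p in in_target if len(support[p]) > 1)
    return backed / len(in_target)


# Placeholder rows; real input would be read from a downloaded file.
rows = [
    "# protein_id\tdataset",
    "P1\tstudy_A",
    "P1\tstudy_B",
    "P2\tstudy_A",
    "P3\tstudy_B",
    "P3\tstudy_C",
]

support = load_annotations(rows)
for name in ("study_A", "study_B"):
    print(name, corroborate(support, name))
```

A simple overlap fraction is used here for clarity; a real analysis would probably also need to account for data sets that were derived from one another and are therefore not independent.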