- All extant eukaryotes evolved from a common ancestor that already possessed an α-proteobacterial endosymbiont that gave rise to the mitochondria and their degraded relatives, hydrogenosomes and mitosomes (van der Giezen and Tovar 2005; Embley 2006).
- Eukaryotes possess 2 distinct sets of genes, one showing apparent phylogenetic affinity with homologs from archaea and the other more closely related to bacterial homologs (not all eukaryotic genes belong to these 2 sets, of course; many are of uncertain origin, and many more appear to be unique to eukaryotes). There is a clear functional divide between the “archaeal” and “bacterial” genes of eukaryotes, with the former largely encoding proteins involved in information processing (translation, transcription, replication, and repair) and the latter encoding proteins with “operational” functions (metabolic enzymes, components of membranes and other cellular structures, etc.) (Esser et al. 2004; Rivera and Lake 2004). In some of the informational and operational systems, the archaeal and bacterial affinities, respectively, of eukaryotic genes are manifest qualitatively: the key proteins involved in DNA replication in archaea and eukaryotes are not homologous to the functionally analogous proteins of bacteria (Leipe et al. 1999), and conversely, some of the principal enzymes of membrane biogenesis are homologous in eukaryotes and bacteria but not in archaea (Pereto et al. 2004).
Apparently, the most parsimonious scenario of eukaryogenesis combining these 2 key facts is that the first eukaryote was an archaeal–bacterial chimera that emerged as a result of an invasion of an archaeon by an α-proteobacterium, the well-established ancestor of the mitochondria (Martin and Muller 1998; Rivera and Lake 2004; Martin and Koonin 2006). However, this is by no means the only scenario of eukaryotic origins that is currently actively considered (Embley and Martin 2006; Poole and Penny 2007b). The main competitor is, probably, the archezoan hypothesis under which the host of the α-proteobacterial endosymbiont was not an archaeon but a primitive, obviously, amitochondrial, proto-eukaryote that already possessed the hallmarks of the eukaryotic cell, such as the endomembrane system, the nucleus, and the cytoskeleton (Kurland et al. 2006; Poole and Penny 2007a). The symbiotic scenarios substantially differ from the archezoan hypothesis with respect to the level of complexity that is attributed to the host of the mitochondrial endosymbiont. Under the symbiotic hypotheses, the host was a “garden variety” archaeon, with the dramatic complexification of the cellular organization being triggered by the symbiosis. In contrast, the archezoan hypothesis posits that, at least, some substantial aspects of the characteristic eukaryotic complexity (e.g., the endomembrane system) evolved prior to and independent of the symbiosis and were already in place in the organism that hosted the mitochondrion. Under the archezoan scenario, the presence of archaea-like genes in the ancestral eukaryotic gene set is, then, explained either by postulating that the proto-eukaryotic lineage was a sister group of archaea and/or by horizontal transfer of archaeal genes. The archezoan hypothesis was seriously undermined by the realization that all unicellular eukaryotes previously thought to be primitively amitochondrial actually possess degraded organelles of α-proteobacterial descent. Nevertheless, the archezoan scenario stays alive, with the proviso that the ancestral archezoan lineage had gone extinct (Poole and Penny 2007a). In addition, more complex scenarios have been considered, with an ancient, primary symbiosis leading to the emergence of a nucleated, amitochondriate, proto-eukaryotic cell and antedating the acquisition of an α-proteobacterium that gave rise to the mitochondria. A γ-proteobacterium (Horiike et al. 2004), a δ-proteobacterium (Moreira and Lopez-Garcia 1998; Lopez-Garcia and Moreira 2006), a Clostridium-like gram-positive bacterium (Karlin et al. 1999), or a spirochaete (Margulis 1996) have been variously proposed as bacterial counterparts of this putative primary symbiosis. The possibility also has been considered that the nucleus itself is a derived endosymbiont, a descendant of a Crenarchaeon (Lake and Rivera 1994) or a Euryarchaeon, such as Pyrococcus (Horiike et al. 2004) that invaded a bacterial host.
The rapidly growing collection of sequenced genomes from different domains and lineages of life allows empirical testing of these hypotheses by phylogenetic analysis of genome-wide data. The problem of eukaryogenesis is extremely hard and complex, given the depth of the divergences involved, and arguably, has to be tackled piecemeal, by deciphering the origins of particular subsets of eukaryotic genes and signature eukaryotic functional systems through thorough phylogenetic analysis. Here we address the specific evolutionary origins of those eukaryotic genes that appear to show an affinity with archaeal homologs. In particular, we ask whether the archaea-related “parent” of eukaryotes comes from within the phylogenetic span of the extant archaea, that is, originates from either Euryarchaeota or Crenarchaeota, or from outside that span, perhaps representing a distinct archaeal branch or even a distinct domain of life. Clearly, in the first case, eukaryotes are expected to be rooted within either Crenarchaeota or Euryarchaeota in phylogenetic trees, whereas in the second case, Eukarya should branch outside of the archaeal clade.
Phylogenetic analyses and other types of evolutionary reconstructions aimed at elucidating the evolutionary relationship between archaea and eukaryotes have yielded conflicting results. Some early comparisons of ribosomal structure and phylogenetic analyses have suggested a specific affinity between eukaryotic genes and their orthologs from Crenarchaeota (dubbed eocytes on the basis of this observation) (Lake et al. 1984; Lake 1988; Rivera and Lake 1992). Support for the eocyte hypothesis has been subsequently claimed from comparative analysis of ribosomal protein sequences (Vishwanath et al. 2004) and from a novel approach to whole-genome–based phylogenetic analysis (Rivera and Lake 2004).
Given these conflicting conclusions on the nature of the archaeal–eukaryotic affinity that have been reached over the years using widely different methods along with a variety of biological considerations, we were compelled to attempt an exhaustive phylogenetic analysis of eukaryotic genes of apparent archaeal origin, with a minimal set of assumptions. We do not take it for granted that genes in a lineage share a common history (Gogarten et al. 2002; Bapteste et al. 2005; Doolittle and Bapteste 2007) and avoid concatenation of sequences of individual genes or a supertree-type analysis of individual trees. Instead, trees for orthologous gene sets were built separately, their topologies were assessed with several independent methods, and a post hoc census was taken.
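The logic of such a per-gene census can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not a reconstruction of the tree-building or topology-testing methods used in this study: each gene tree is assumed to be already rooted and is written as nested tuples, leaf labels carry a hypothetical group prefix (`euk:`, `cren:`, `eury:`), and the function names, family names, and taxon labels are invented for illustration. A tree is scored by the groups found in the sister clade of a monophyletic eukaryotic clade.

```python
from collections import Counter

def leaf_groups(tree):
    """Return the set of group prefixes (text before ':') found under a node."""
    if isinstance(tree, str):
        return {tree.split(":")[0]}
    groups = set()
    for child in tree:
        groups |= leaf_groups(child)
    return groups

def sister_of_eukaryotes(tree):
    """Groups in the sister clade(s) of the eukaryotic clade, or None
    if eukaryotes are absent or do not form a single clade."""
    if isinstance(tree, str):
        return None
    child_groups = [leaf_groups(child) for child in tree]
    has_euk = ["euk" in g for g in child_groups]
    if sum(has_euk) != 1:
        return None  # eukaryotes absent, or split across several children
    idx = has_euk.index(True)
    if child_groups[idx] == {"euk"}:
        # The eukaryotic clade is a child of this node; pool its siblings.
        sisters = set()
        for i, g in enumerate(child_groups):
            if i != idx:
                sisters |= g
        return sisters
    # Eukaryotes sit deeper inside a mixed child; descend into it.
    return sister_of_eukaryotes(tree[idx])

def classify(tree):
    """Assign one gene tree to a coarse topology class (hypothetical labels)."""
    sisters = sister_of_eukaryotes(tree)
    if sisters == {"cren"}:
        return "crenarchaeal"
    if sisters == {"eury"}:
        return "euryarchaeal"
    if sisters == {"cren", "eury"}:
        return "deep"  # eukaryotes branch outside both extant groups
    return "unresolved"

# Invented example trees, one per hypothetical orthologous gene family.
trees = {
    "family_A": (("euk:e1", "euk:e2"),
                 (("cren:c1", "cren:c2"), ("eury:u1", "eury:u2"))),
    "family_B": ((("euk:e1", "euk:e2"), ("cren:c1", "cren:c2")),
                 ("eury:u1", "eury:u2")),
    "family_C": (("cren:c1", "cren:c2"),
                 (("euk:e1", "euk:e2"), ("eury:u1", "eury:u2"))),
    "family_D": (("euk:e3", "euk:e4"),
                 (("cren:c3", "cren:c4"), ("eury:u3", "eury:u4"))),
    "family_E": (("euk:e1", ("cren:c1", "euk:e2")),
                 ("eury:u1", "eury:u2")),
}

# Post hoc census: count how many gene trees fall into each class.
census = Counter(classify(tree) for tree in trees.values())
print(census.most_common())
```

Keeping the classification per tree and tallying only at the end mirrors the decision not to concatenate genes: each family contributes one independent vote, and families with conflicting histories remain visible in the counts instead of being averaged away. A real analysis would also need statistical support for each topology, which this sketch omits.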
We conclude that neither Crenarchaeota nor Euryarchaeota made the decisive contribution to the archaeal component of the ancestral eukaryotic gene set. The bulk of the eukaryotic genes with an apparent archaeal affinity seem to originate from a distinct archaeal lineage that branched off the trunk of the archaeal tree prior to the radiation of Crenarchaeota and Euryarchaeota. A limited amount of horizontal gene transfer (HGT) might have led to the acquisition of the few eukaryotic genes that do show Crenarchaeal and Euryarchaeal affinities.