Although the ultimate origin of eukaryotic spliceosomal introns is not known with certainty, the current dominant hypothesis has it that they are derivatives of prokaryotic Group II self-splicing introns that gave rise both to introns and to the active RNA moieties of the spliceosome (
Lambowitz and Zimmerly 2004;
Robart and Zimmerly 2005;
Martin and Koonin 2006). Recently, this hypothesis received a strong boost from the resolved structure of a Group II intron that showed extensive similarities to the structures of spliceosomal snRNAs and the ends of introns themselves (
Toor et al. 2008).
It has been further proposed that the invasion of the ancestral eukaryotic genome by Group II introns was triggered by the mitochondrial endosymbiosis (
Martin and Koonin 2006). Under this hypothesis, the α-proteobacterial ancestor of the mitochondria contained multiple Group II introns (this is compatible with the relatively high abundance of these elements in some α-proteobacteria [
Robart and Zimmerly 2005]) that became unleashed, in part, because of recurrent release of the symbiont DNA into the host cell. This scenario of intron invasion does not depend on the nature of the organism that hosted the mitochondrial endosymbiont, that is, whether it was a typical archaeon as posited by the symbiotic hypotheses of eukaryogenesis (
Embley and Martin 2006;
Martin and Koonin 2006) or a distinct protoeukaryotic form as suggested by the archezoan hypothesis (
Kurland et al. 2006;
Poole and Penny 2007).
As suggested by the above estimates, the invasion of Group II introns led to a fairly high density of introns over the supposedly brief time interval that separated the acquisition of the mitochondrial endosymbiont from the advent of LECA. Let us assume that the intron invasion was instantaneous on the evolutionary scale, that is, rapid enough to disregard intron loss and deterioration. This assumption appears credible if the invasion was the direct consequence of endosymbiosis (
Martin and Koonin 2006). The implications for the genome architecture of the immediate predecessors of LECA seem to be striking. Group II introns are complex elements that encode a large protein containing a reverse transcriptase domain and several accessory domains; accordingly, these elements have a (nearly) uniform size of approximately 2.5 kb (
Lambowitz and Zimmerly 2004).
Thus, under the (near) instantaneous invasion scenario, that is, assuming that intron invasion occurred faster than substantial loss of intronic sequences, the median size of the introns in the (pre)LECA genome would be considerably greater than in any extant genome, and the mean size would be greater than that in any modern form, with the exception of mammals and some other vertebrates. In modern bacteria, Group II introns reside mostly in intergenic regions (and therefore should be more properly regarded as retroelements rather than bona fide introns), presumably because an intron that invades a functionally important protein-coding sequence is rapidly weeded out by the highly efficient purifying selection that affects large prokaryotic populations (
Robart and Zimmerly 2005). By contrast, protein-coding genes of endosymbiotic organelles, namely, fungal and plant mitochondria and plant and algal chloroplasts, often carry bona fide, self-splicing introns (
Toro et al. 2007). The latter situation could be the model for the events that transpired during the mitochondrial endosymbiosis, except that the original intron invasion of protein-coding regions that quickly reached the inferred high intron density must have been a much more dramatic event, a virtual genome catastrophe that could have been realized only under the conditions of a major population bottleneck (see Implications for Eukaryogenesis: Intron Invasion Was Accompanied by a Major Population Bottleneck That Enabled Key Eukaryotic Innovations). All known genomes of nonparasitic prokaryotes possess short intergenic regions that amount to, at most, 10–15% of the genomic sequence and a uniform mean gene size of about 1000 nucleotides (
Koonin and Wolf 2008). Assuming that the prokaryotic host of the mitochondrial endosymbiont and, accordingly, the emerging eukaryote at the earliest stages of eukaryogenesis possessed the typical prokaryotic genome architecture and that Group II introns inserted randomly into the coding and noncoding sequences, one can estimate the fraction of its genome allotted to introns. With the estimated intron density of 2 introns per kilobase of coding sequence and the mean intron length of 2.5 kb, each kilobase of coding sequence would carry 5 kb of intronic sequence, so that introns would account for 5/6 of each gene. Taking the genic fraction of the genome as 0.85 (intergenic regions of at most 15%), this yields 5/6 × 0.85 ≈ 0.71, that is, more than 70% of the genome of the pre-LECA eukaryotes would be occupied by introns, by far the greatest intron content compared with modern eukaryotic genomes. Thus, the ancient eukaryotic forms might have had literally intron-dominated genomes so that the rest of eukaryotic evolution, with some notable exceptions like the expansion of introns in mammals and, possibly, some short episodes of substantial intron gain (
Carmel et al. 2007), could be a story of intron loss and shrinkage.
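The back-of-the-envelope estimate above can be reproduced with a few lines of Python. This is only an illustrative sketch of the arithmetic: the function name `intron_fraction` and its parameter names are hypothetical, and the inputs (2 introns per kilobase of coding sequence, 2.5 kb per intron, a genic fraction of 0.85) are the values assumed in the text.

```python
def intron_fraction(introns_per_kb_coding, intron_length_kb, genic_fraction):
    """Estimate the fraction of a genome occupied by introns.

    Hypothetical helper that mirrors the estimate in the text:
    the intronic share of each gene multiplied by the share of
    the genome taken up by genes.
    """
    # Kilobases of intronic sequence carried by 1 kb of coding sequence.
    intron_kb_per_coding_kb = introns_per_kb_coding * intron_length_kb
    # Intronic share of a gene: intron kb / (intron kb + 1 kb of coding sequence).
    within_gene = intron_kb_per_coding_kb / (intron_kb_per_coding_kb + 1.0)
    # Scale by the fraction of the genome assumed to be genic.
    return within_gene * genic_fraction


# Values assumed in the text: 2 introns/kb, 2.5 kb per intron, 85% genic.
fraction = intron_fraction(2.0, 2.5, 0.85)
print(f"{fraction:.2f}")  # prints 0.71
```

Varying the inputs shows how sensitive the estimate is to the assumed intron density and intron length; for example, halving the density to 1 intron per kilobase lowers the intronic share of each gene from 5/6 to 2.5/3.5.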
It is worth noting that the above inference of the extremely high fraction of intron sequences in the genomes of primordial eukaryotes is predicated on the scenario under which the primary invasion of introns antedated the emergence of the spliceosome (
Martin and Koonin 2006). In principle, the reverse order of events is conceivable, whereby the spliceosome, which clearly predated LECA (
Collins and Penny 2005), evolved prior to intron invasion, perhaps performing a different function, and was available to catalyze the excision of the invading Group II introns. However, considering the apparent homology between the catalytic snRNAs of the spliceosome and the ribozyme part of Group II introns (
Toor et al. 2008), the invasion-first scenario appears distinctly more plausible.