The timely removal of large introns from pre-mRNA poses a challenging problem for the spliceosomal machinery. It has been shown experimentally and computationally that Drosophila melanogaster uses a special strategy, named recursive splicing, for the excision of large introns. Recursive splicing occurs via selective accumulation of combined donor-acceptor splice sites called RP-sites. In our research, we used the complete set of Drosophila large introns to confirm the previous computation by Burnette et al., showing again that in fruit fly RP-sites are more than 20 times more abundant within large introns than on their complementary strands. Similarly, all other studied Insecta species (mosquito, honey bee, and beetle), as well as a more distant invertebrate (sea urchin), showed an accumulation of RP-sites within their large introns that was several times greater than on the complementary strand. We also showed that the accumulation of RP-sites is specific with respect to intron size class in fruit fly but not in human (Supplementary Figure S1). On the other hand, none of the studied vertebrates, including six mammals, showed significant accumulation of RP-sites (see Supplementary Figure S2 for a visual representation of this phenomenon). Moreover, vertebrate species have overwhelmingly more large introns than the examined invertebrates. Therefore, vertebrates must use another molecular mechanism for the removal of their large introns from pre-mRNA. We hypothesized that multiple hairpins with large loops could form compact spatial structures within large introns that bring the donor and acceptor splice sites into close proximity and thereby facilitate splicing.
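The strand comparison described above can be illustrated with a minimal sketch that counts motif matches on the sense strand of each intron and on its reverse complement. The regular expression below is a hypothetical, simplified stand-in for an RP-site (a short pyrimidine run, an acceptor-like CAG, then a donor-like GTAAGT or GTGAGT); it is an assumption made for illustration and not the scoring scheme used in this study or by Burnette et al.

```python
import re

# Hypothetical, simplified RP-site pattern (illustrative assumption only):
# four pyrimidines, any base, acceptor "CAG", then donor "GTAAGT"/"GTGAGT".
# The lookahead lets overlapping matches be counted.
RP_PATTERN = re.compile(r"(?=[CT]{4}[ACGT]CAGGT[AG]AGT)")

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (upper-cased)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_rp_sites(seq: str) -> int:
    """Count matches of the simplified RP-site pattern on the given strand."""
    return sum(1 for _ in RP_PATTERN.finditer(seq.upper()))


def strand_enrichment(introns):
    """Return (sense count, antisense count, sense/antisense ratio)."""
    sense = sum(count_rp_sites(s) for s in introns)
    antisense = sum(count_rp_sites(reverse_complement(s)) for s in introns)
    ratio = sense / antisense if antisense else float("inf")
    return sense, antisense, ratio


# Toy input: one sequence carrying the motif on its sense strand only.
toy_intron = "AAAA" + "TTTTACAGGTAAGT" + "AAAA"
sense_count, antisense_count, ratio = strand_enrichment([toy_intron])
```

Using the complementary strand as the control keeps base composition and sequence length identical between the two counts, so a ratio well above one points to strand-specific accumulation rather than compositional bias.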
To test this conjecture, we examined the distribution of possible stable stem structures inside the large introns of vertebrates and invertebrates. Within Drosophila's large introns, stem structures are practically absent. The same trend was observed for the invertebrates honey bee and mosquito. In mammals, on the other hand, multiple SINE and LINE repeats (primarily SINEs) located in different orientations throughout large introns drive the potential formation of hairpins with large loops. In human, there was an average of about 9.4 possible hairpins per 50 kb of the analyzed large intron sequence fragments. The vast majority of these possible stems (81.7%) are formed by oppositely oriented primate-specific Alu repeats. Other investigated mammals do not have Alu elements but have other types of evolutionarily new SINEs specific to their taxa. These SINEs could also allow for the formation of multiple hairpin structures inside large introns. Only one of the studied vertebrates, chicken, does not have SINE elements in its genome. Instead, the chicken has very abundant and relatively short LINE elements that comprise over 60% of its repetitive elements. Thus, in chicken large introns, possible stems may be formed solely by LINE repeats. One may observe, however, that the chicken has very few predicted stems, fewer than any other studied vertebrate species and comparable to some insect species. It may be that avian genomes deal with large intron splicing differently than other vertebrates. Two facts, though, are clear: the predicted stems for chicken are quite strong and stable (see ), and the chicken has several times fewer large introns than any of the studied mammalian species (see ).
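The idea that oppositely oriented repeats of the same family can pair into hairpins with large loops can be sketched as a simple search over repeat annotations. The sketch below is not the prediction procedure used in this study: it assumes repeat coordinates, strands, and family labels are already available (for example, from a repeat annotation table), and it treats any inverted same-family pair within a loop-size window as a candidate stem without computing folding energy. The `Repeat` record, the function names, and the `min_loop`/`max_loop` thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repeat:
    start: int   # 0-based start within the intron
    end: int     # exclusive end
    strand: str  # "+" or "-"
    family: str  # family label as given by the annotation, e.g. "Alu"


def candidate_hairpins(repeats, min_loop=1000, max_loop=50_000):
    """Greedily pair each repeat with the nearest downstream repeat of the
    same family on the opposite strand; each repeat is used at most once.
    Returns a list of (left, right, loop_length) tuples."""
    ordered = sorted(repeats, key=lambda r: r.start)
    used = set()
    pairs = []
    for i, left in enumerate(ordered):
        if i in used:
            continue
        for j in range(i + 1, len(ordered)):
            right = ordered[j]
            loop = right.start - left.end
            if loop > max_loop:
                break  # later repeats start even further downstream
            if j in used or loop < min_loop:
                continue
            if right.family == left.family and right.strand != left.strand:
                pairs.append((left, right, loop))
                used.update((i, j))
                break
    return pairs


def hairpins_per_50kb(pairs, intron_length):
    """Normalize the number of candidate hairpins to 50 kb of intron."""
    return len(pairs) * 50_000 / intron_length


# Toy annotation for a hypothetical 100 kb intron.
toy_repeats = [
    Repeat(1_000, 1_300, "+", "Alu"),
    Repeat(5_000, 5_300, "+", "Alu"),
    Repeat(14_000, 14_300, "-", "Alu"),
    Repeat(30_000, 30_300, "-", "L1"),
]
toy_pairs = candidate_hairpins(toy_repeats)
toy_density = hairpins_per_50kb(toy_pairs, 100_000)
```

A realistic analysis would additionally require sequence complementarity between the paired repeats and an estimate of stem stability; the orientation-and-family test here only narrows the set of candidates.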
Interestingly, the beetle and especially the sea urchin contain the most predicted stems of all studied invertebrates. Although the sea urchin contains the most predicted stems, a number even comparable to that of zebrafish, a large fraction of these predicted stems (47.5%) overlap with simple and low-complexity repeats, which might form hairpin structures without loops rather than the stems with large loops that we predict in mammals. Curiously, the beetle's predicted stems are not strongly associated with any particular kind of repeat. We suppose that the beetle's predicted stems might be formed by as yet unidentified repeats, or that they are merely part of more complicated RNA secondary structures.
The average number of predicted long and stable stems in the large introns of different mammals is 5.5 to 14 per 50 kb of large introns (see ). These stems create large loops with an average size of 12.3 to 15.2 kilobases. Relatively large loops, with lengths up to 3 kilobases, are characteristic of group I and group II introns containing ORFs. Reportedly, about 30% of group I introns and about 25% of group II introns encode proteins. These coding sequences are located inside loops that do not have specific secondary structures. The ORF-containing loops of group I introns are around 1000 nucleotides in length, while those of group II introns are even larger. The latter encode proteins with an average size of 500–600 aa, according to the Group II intron database. Moreover, some of these proteins are significantly larger (up to 1064 aa in the M.p.atpAI1 intron). Interestingly, these large ORF-containing loops of group I and II introns have relatively short terminal stems, usually no longer than 12 nucleotides, with an MFE weaker than −10 kcal/mol (P6 or P8 stems for group I introns; IV stems for group II introns). Multiple hairpins of these introns form complex 3D structures that include pseudoknots and non-Watson-Crick base pairing. Presently, there are no reliable algorithms or programs for calculating the free energy of such structures, so we do not provide such estimates. However, each individual stem of group I and II introns has a folding energy at least ten times weaker than −258 kcal/mol, the average minimum free energy of the predicted stems of large introns in human (see ). Therefore, it is reasonable to hypothesize that numerous SINE and LINE repetitive elements within large mammalian introns are able to form multiple large hairpins with stems 100–300 nucleotides long and loops up to 15 kb long. Such structures might help to bring the donor and acceptor splice junctions of large introns closer to each other and thus increase the efficiency of their splicing. Indeed, it has recently been shown that even in the short introns of Saccharomyces cerevisiae, secondary structures facilitate splicing by bringing together splicing elements.
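The "at least ten times weaker" comparison follows directly from the two energies quoted above, as the short check below shows. It uses only the figures given in the text (−258 kcal/mol for the average human stem and the −10 kcal/mol bound for terminal stems of group I and II introns); the variable and function names are hypothetical.

```python
# Values quoted in the text, in kcal/mol.
HUMAN_STEM_MFE = -258.0          # average MFE of predicted human stems
GROUP_INTRON_STEM_BOUND = -10.0  # terminal stems are weaker than this bound


def stability_ratio(stronger_mfe: float, weaker_mfe: float) -> float:
    """How many times more negative the first energy is than the second."""
    return stronger_mfe / weaker_mfe


ratio = stability_ratio(HUMAN_STEM_MFE, GROUP_INTRON_STEM_BOUND)
# Because group I/II terminal stems are weaker than the bound, the true
# ratio for any individual stem is at least this large.
```

Since the −10 kcal/mol figure is an upper bound on stem strength, the computed ratio is a lower bound on how much more stable the predicted human stems are.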
Insertion of interspersed retrotransposon elements, such as SINEs and LINEs, is a major force for the expansion of genome size as a whole and of intron sizes in particular. Accumulation of new types of retrotransposons occurs gradually and could take millions of years. Once an intron has gained several interspersed repetitive elements inserted in opposite orientations, these elements could base-pair with one another to form hairpin structures with long stems. These hairpins would introduce a new spatial organization into intronic RNA by keeping the donor and acceptor splice sites in close proximity. Such a spatial organization could become a novel mechanism for facilitating the splicing of large introns. If RP-sites were indeed already present, this competing mechanism for efficient splicing could, in turn, ease the selective constraints that preserve recursive splicing and decrease RP-site frequency to the random expectation. We therefore hypothesize that oppositely oriented interspersed repetitive elements may be playing this role in the large introns of vertebrate species. It is indeed interesting to consider that the possible problems caused by the expansion of introns through the insertion of repetitive elements may at once be remedied by the base-pairing of the very same elements. Whatever forces drove or allowed the formation of such possible stem structures, their potential role in the efficient splicing of large introns poses an appealing question to molecular biologists, a question that is suggestive for future work in vitro