Our study reveals diverse splicing patterns of exonized Alu elements in the human transcriptome. Most new exons derived from Alu elements probably represent non-functional splice forms that are included in transcripts at low frequencies [4], [6]. However, a small subset of exonization events, in particular those associated with more ancient Alu elements, could evolve strong splicing regulatory signals to become constitutive or tissue-specific, possibly driven by positive selection. The analysis of high-density exon tiling array data across a broad range of tissues provides an efficient approach to identify such exons. Considering the incomplete coverage of human transcribed regions by Exon 1.0 arrays, and the high noise in the observed intensities of probes targeting individual exons [57], [58], we expect that many constitutive or tissue-specific Alu-derived exons are missed by this study. Also, while we focus on primate-specific exons derived from Alu repeats, a recent study by Alekseyenko and colleagues identified nearly 3000 human-specific exons created by de novo substitution in intronic regions during primate evolution [59]. With improved exon microarray platforms and analysis algorithms in the future, more species-specific exons with regulatory roles are likely to be discovered.
Our data provide novel insight into the evolutionary impact of newly created exons in eukaryotic genomes. During evolution, new exons are frequently added to existing functional genes via a variety of mechanisms, such as exonization of transposable elements, exon duplication, and de novo exonization from intronic regions [6]. Modrek and Lee found that the birth of new exons was strongly coupled with the widespread occurrence of alternative splicing in eukaryotic genes [60]. Through pairwise comparisons of human and rodent genomes, they showed that nearly 75% of human alternatively spliced exons with low transcript inclusion levels were absent from the corresponding genomic sequence of the rodent orthologs. By contrast, the corresponding figure was less than 5% for constitutive exons [60]. This pattern was corroborated by subsequent analyses of exon creation events in vertebrates using multiple genome alignments [48], [59]. Based on these observations, Modrek and Lee proposed an evolutionary model in which alternative splicing facilitates the evolution of new exons: the creation of a new exon in the minor transcript isoform keeps the original gene product intact, which reduces the negative selection pressure against the new exon and allows it to evolve towards an adaptive function [10], [60]. On the other hand, this evolutionary model also predicts that the vast majority of new exons found by comparative genomics analyses are non-functional evolutionary intermediates. In fact, most previous genomic studies have focused on the low transcript inclusion levels of new exons [4], [6], [48], [59], [60]. It is unclear to what extent new exons could have produced functional and regulatory novelties. In this study, based on a large-scale splicing analysis of human tissues, we show that a number of primate-specific exons derived from Alu retrotransposons have a major impact on their genes' mRNA/protein products in a ubiquitous or tissue-specific manner. In SEPN1, the strong transcript inclusion and muscle-specificity of the Alu-derived exon represent a human-specific splicing change that arose after the divergence of humans and chimpanzees. These data suggest that some new exons may contribute to species-specific differences between humans and non-human primates.
Our study has identified a large set of Alu-derived exons with substantial transcript inclusion levels. This exon list can be valuable for a variety of further investigations. These exons provide candidates for detailed mechanistic analyses and can be used to characterize the splicing regulatory mechanisms of Alu-derived exons. If suitable tissue samples from closely or distantly related primate species are available, it will be possible to reconstruct precisely the evolutionary events preceding the emergence of constitutive or tissue-specific Alu-derived exons. Further experimental studies will be needed to elucidate the functional significance of individual exonization events (e.g., the muscle-specific inclusion of the Alu-derived exon in SEPN1).