Chromatin organization affects alternative splicing and previous studies have shown that exons have increased nucleosome occupancy compared with their flanking introns. To determine whether alternative splicing affects chromatin organization we developed a system in which the alternative splicing pattern switched from inclusion to skipping as a function of time. Changes in nucleosome occupancy were correlated with the change in the splicing pattern. Surprisingly, strengthening of the 5′ splice site or strengthening the base pairing of U1 snRNA with an internal exon abrogated the skipping of the internal exons and also affected chromatin organization. Over-expression of splicing regulatory proteins also affected the splicing pattern and changed nucleosome occupancy. A specific splicing inhibitor was used to show that splicing impacts nucleosome organization endogenously. The effect of splicing on the chromatin required a functional U1 snRNA base pairing with the 5′ splice site, but U1 pairing was not essential for U1 snRNA enhancement of transcription. Overall, these results suggest that splicing can affect chromatin organization.
To gain global insights into the role of the well-known repressive splicing regulator PTB we analyzed the consequences of PTB knockdown in HeLa cells using high-density oligonucleotide splice-sensitive microarrays. The major class of identified PTB-regulated splicing event was PTB-repressed cassette exons, but there was also a substantial number of PTB-activated splicing events. PTB repressed and activated exons showed a distinct arrangement of motifs with pyrimidine-rich motif enrichment within and upstream of repressed exons, but downstream of activated exons. The N-terminal half of PTB was sufficient to activate splicing when recruited downstream of a PTB-activated exon. Moreover, insertion of an upstream pyrimidine tract was sufficient to convert a PTB-activated to a PTB-repressed exon. Our results demonstrate that PTB, an archetypal splicing repressor, has variable splicing activity that predictably depends upon its binding location with respect to target exons.
Since the emergence of next-generation sequencing (NGS) technologies, great effort has been put into the development of tools for analysis of the short reads. In parallel, knowledge is increasing regarding biases inherent in these technologies. Here we discuss four different biases we encountered while analyzing various Illumina datasets. These biases are due to both biological and statistical effects that in particular affect comparisons between different genomic regions. Specifically, we encountered biases pertaining to the distributions of nucleotides across sequencing cycles, to mappability, to contamination of pre-mRNA with mRNA, and to non-uniform hydrolysis of RNA. Most of these biases are not specific to one analyzed dataset, but are present across a variety of datasets and within a variety of genomic contexts. Importantly, some of these biases correlated in a highly significant manner with biological features, including transcript length, gene expression levels, conservation levels, and exon-intron architecture, misleadingly lending credibility to results that are in fact caused by the biases. We also demonstrate the relevance of these biases when analyzing an NGS dataset that maps transcriptionally engaged RNA polymerase II (RNAPII) with respect to exon-intron architecture, and show that elimination of these biases is crucial for avoiding erroneous interpretation of the data. Collectively, our results highlight several important pitfalls, challenges and approaches in the analysis of NGS reads.
Familial Dysautonomia (FD) is an autosomal recessive congenital neuropathy that results from abnormal development and progressive degeneration of the sensory and autonomic nervous system. The mutation observed in almost all FD patients is a point mutation at position 6 of intron 20 of the IKBKAP gene; this gene encodes the IκB kinase complex-associated protein (IKAP). The mutation results in a tissue-specific splicing defect: exon 20 is skipped, leading to reduced IKAP protein expression. Here we show that phosphatidylserine (PS), an FDA-approved food supplement, increased IKAP mRNA levels in cells derived from FD patients. Long-term treatment with PS led to a significant increase in IKAP protein levels in these cells. A conjugate of PS and an omega-3 fatty acid also increased IKAP mRNA levels. Furthermore, PS treatment released FD cells from cell cycle arrest and up-regulated a significant number of genes involved in cell cycle regulation. Our results suggest that PS has potential for use as a therapeutic agent for FD. Understanding its mechanism of action may reveal the mechanism underlying FD.
Transposable elements (TEs) have played an important role in the diversification and enrichment of mammalian transcriptomes through various mechanisms such as exonization and intronization (the birth of new exons/introns from previously intronic/exonic sequences, respectively), and insertion into first and last exons. However, no extensive analysis has compared the effects of TEs on the transcriptomes of mammals, non-mammalian vertebrates and invertebrates.
We analyzed the influence of TEs on the transcriptomes of five species, three invertebrates and two non-mammalian vertebrates. Compared to previously analyzed mammals, there were lower levels of TE introduction into introns, significantly lower numbers of exonizations originating from TEs and a lower percentage of TE insertion within the first and last exons. Although the transcriptomes of vertebrates exhibit significant levels of exonization of TEs, only anecdotal cases were found in invertebrates. In vertebrates, as in mammals, the exonized TEs are mostly alternatively spliced, indicating that selective pressure maintains the original mRNA product generated from such genes.
Exonization of TEs is widespread in mammals, less so in non-mammalian vertebrates, and very low in invertebrates. We assume that the exonization process depends on the length of introns. Vertebrates, unlike invertebrates, are characterized by long introns and short internal exons. Our results suggest that there is a direct link between the length of introns and exonization of TEs and that this process became more prevalent following the appearance of mammals.
Insertion of transposed elements within mammalian genes is thought to be an important contributor to mammalian evolution and speciation. Insertion of transposed elements into introns can lead to their activation as alternatively spliced cassette exons, an event called exonization. Elucidation of the evolutionary constraints that have shaped fixation of transposed elements within human and mouse protein coding genes and subsequent exonization is important for understanding of how the exonization process has affected transcriptome and proteome complexities. Here we show that exonization of transposed elements is biased towards the beginning of the coding sequence in both human and mouse genes. Analysis of single nucleotide polymorphisms (SNPs) revealed that exonization of transposed elements can be population-specific, implying that exonizations may enhance divergence and lead to speciation. SNP density analysis revealed differences between Alu and other transposed elements. Finally, we identified cases of primate-specific Alu elements that depend on RNA editing for their exonization. These results shed light on TE fixation and the exonization process within human and mouse genes.
Transposable elements (TEs) have contributed a wide range of functional sequences to their host genomes. A recent paper in BMC Molecular Biology discusses the creation of new transcripts by transposable element insertion upstream of retrocopies and the involvement of such insertions in tissue-specific post-transcriptional regulation.
Regulation of splicing in eukaryotes occurs through the coordinated action of multiple splicing factors. Exons and introns contain numerous putative binding sites for splicing regulatory proteins. Regulation of splicing is presumably achieved by the combinatorial output of the binding of splicing factors to the corresponding binding sites. Although putative regulatory sites often overlap, no extensive study has examined whether overlapping regulatory sequences provide yet another dimension to splicing regulation. Here we analyzed experimentally identified splicing regulatory sequences using a computational method based on the natural distribution of nucleotides and splicing regulatory sequences. We uncovered positive and negative interplay between overlapping regulatory sequences. Examination of these overlapping motifs revealed a unique spatial distribution, especially near splice donor sites of exons with weak splice donor sites. The positively selected overlapping splicing regulatory motifs were highly conserved among different species, implying functionality. Overall, these results suggest that overlap of two splicing regulatory binding sites is an evolutionarily conserved, widespread mechanism of splicing regulation. Finally, over-abundant motif overlaps were experimentally tested in a reporter minigene, revealing that an overlap may facilitate a mode of splicing that did not occur in the presence of only one of the two regulatory sequences that make up the overlap.
Throughout evolution, eukaryotic genomes have been invaded by transposable elements (TEs). Little is known about the factors leading to genomic proliferation of TEs, their preferred integration sites and the molecular mechanisms underlying their insertion. We analyzed hundreds of thousands of nested TEs in the human genome, i.e. insertions of TEs into existing ones. We first discovered that most TEs insert within specific ‘hotspots’ along the targeted TE. In particular, retrotransposed Alu elements contain a non-canonical single nucleotide hotspot for insertion of other Alu sequences. We next devised a method for identification of integration sequence motifs of inserted TEs that are conserved within the targeted TEs. This method revealed novel sequence motifs characterizing insertions of various important TE families: Alu, hAT, ERV1 and MaLR. Finally, we performed a global assessment to determine the extent to which young TEs tend to nest within older transposed elements and identified a 4-fold higher tendency of TEs to insert into existing TEs than to insert within non-TE intergenic regions. Our analysis demonstrates that TEs are highly biased to insert within certain TEs, in specific orientations and within specific targeted TE positions. TE nesting events also reveal new characteristics of the molecular mechanisms underlying transposition.
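The notion of an insertion hotspot can be illustrated with a minimal Python sketch. It assumes that each nesting event has already been reduced to the position, along the consensus of the targeted TE, at which the younger element inserted. The function name `insertion_hotspots`, the `min_share` threshold and the example positions are hypothetical and are not taken from the study, whose actual method is not described here.

```python
from collections import Counter


def insertion_hotspots(positions, min_share=0.2):
    """Return the consensus positions that receive at least `min_share`
    of all insertions (an arbitrary, illustrative threshold)."""
    if not positions:
        raise ValueError("at least one insertion position is required")
    if not 0.0 < min_share <= 1.0:
        raise ValueError("min_share must be in (0, 1]")
    counts = Counter(positions)   # insertions observed at each position
    total = len(positions)
    return sorted(pos for pos, n in counts.items() if n / total >= min_share)


# Toy data (invented): nine insertions, five of them at position 10.
toy_positions = [10, 10, 10, 10, 55, 120, 200, 10, 121]
print(insertion_hotspots(toy_positions))
```

With these toy values only position 10 passes the threshold; real data would call for a statistical test against a uniform background rather than a fixed cutoff.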
More than 5% of alternatively spliced internal exons in the human genome are derived from Alu elements in a process termed exonization. Alus are comprised of two homologous arms separated by an internal polypyrimidine tract (PPT). In most exonizations, splice sites are selected from within the same arm. We hypothesized that the internal PPT may prevent selection of a splice site further downstream. Here, we demonstrate that this PPT enhanced the selection of an upstream 5′ splice site (5′ss), even in the presence of a stronger 5′ss downstream. Deletion of this PPT shifted selection to the stronger downstream 5′ss. This enhancing effect depended on the strength of the downstream 5′ss, on the efficiency of base-pairing to U1 snRNA, and on the length of the PPT. This effect of the PPT was mediated by the binding of TIA proteins and was dependent on the distance between the PPT and the upstream 5′ss. A wide-scale evolutionary analysis of introns across 22 eukaryotes revealed an enrichment in PPTs within ∼20 nt downstream of the 5′ss. For most metazoans, the strength of the 5′ss inversely correlated with the presence of a downstream PPT, indicative of the functional role of the PPT. Finally, we found that the proteins that mediate this effect, TIA and U1C, and in particular their functional domains, are highly conserved across evolution. Overall, these findings expand our understanding of the role of TIA1/TIAR proteins in enhancing recognition of exons, in general, and Alu exons, in particular.
Human genes are composed of functional regions, termed exons, separated by non-functional regions, termed introns. Intronic sequences may gradually accumulate mutations and subsequently become recognized by the splicing machinery as exons, a process termed exonization. Alu elements are prone to undergo exonization: more than 5% of alternatively spliced internal exons in the human genome originate from Alu elements. A typical Alu element is ∼300 nucleotides long, consisting of two arms separated by a polypyrimidine tract (PPT). Interestingly, in most cases, exonization occurs almost exclusively within either the right arm or the left, not both. Here we found that the PPT between the two arms serves as a binding site for TIA proteins and prevents the exon selection process from expanding into downstream regions. To obtain a wider overview of TIA function, we performed a cross-evolutionary analysis within 22 eukaryotes of this protein and of U1C, a protein known to interact with it, and found that functional regions of both these proteins were highly conserved. These findings highlight the pivotal role of TIA proteins in 5′ splice-site selection of Alu exons and exon recognition in general.
Exons are typically only 140 nt in length and are surrounded by intronic oceans that are thousands of nucleotides long. Four core splicing signals, aided by splicing-regulatory sequences (SRSs), direct the splicing machinery to the exon/intron junctions. Many different algorithms have been developed to identify and score the four splicing signals and thousands of putative SRSs have been identified, both computationally and experimentally. Here we describe SROOGLE, a webserver that makes splicing signal sequence and scoring data available to the biologist in an integrated, visual, easily interpretable, and user-friendly format. SROOGLE's input consists of the sequence of an exon and flanking introns. The graphic browser output displays the four core splicing signals with scores based on nine different algorithms and highlights sequences belonging to 13 different groups of SRSs. The interface also offers the ability to examine the effect of point mutations at any given position, as well as a range of additional metrics and statistical measures regarding each potential signal. SROOGLE is available at http://sroogle.tau.ac.il, and may also be downloaded as a desktop version.
Despite decades of research, the question of how the mRNA splicing machinery precisely identifies short exonic islands within the vast intronic oceans remains to a large extent obscure. In this study, we analyzed Alu exonization events, aiming to understand the requirements for correct selection of exons. Comparison of exonizing Alus to their non-exonizing counterparts is informative because Alus in these two groups have retained high sequence similarity but are perceived differently by the splicing machinery. We identified and characterized numerous features used by the splicing machinery to discriminate between Alu exons and their non-exonizing counterparts. Of these, the most novel is secondary structure: Alu exons in general and their 5′ splice sites (5′ss) in particular are characterized by decreased stability of local secondary structures with respect to their non-exonizing counterparts. We detected numerous further differences between Alu exons and their non-exonizing counterparts, among others in terms of exon–intron architecture and strength of splicing signals, enhancers, and silencers. Support vector machine analysis revealed that these features allow a high level of discrimination (AUC = 0.91) between exonizing and non-exonizing Alus. Moreover, the computationally derived probabilities of exonization significantly correlated with the biological inclusion level of the Alu exons, and the model could also be extended to general datasets of constitutive and alternative exons. This indicates that the features detected and explored in this study provide the basis not only for precise exon selection but also for the fine-tuned regulation thereof, manifested in cases of alternative splicing.
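The AUC reported above can be read as the probability that a randomly chosen exonizing Alu receives a higher classifier score than a randomly chosen non-exonizing one. The following Python sketch computes that quantity directly from two lists of scores. No support vector machine is trained here; the function name `roc_auc` and the example scores are hypothetical and are not taken from the study.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive example has the
    higher score; tied pairs count as half."""
    if not scores_pos or not scores_neg:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Invented scores: higher means "more exon-like" to the classifier.
exonizing = [0.9, 0.8, 0.4]
non_exonizing = [0.7, 0.3, 0.2]
print(round(roc_auc(exonizing, non_exonizing), 3))
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation. The pairwise loop is quadratic in the number of examples, so large datasets would normally use a rank-based implementation.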
A typical human gene consists of 9 exons around 150 nucleotides in length, separated by introns that are ∼3,000 nucleotides long. The challenge of the splicing machinery is to precisely identify and ligate the exons, while removing the introns. We aimed to understand how the splicing machinery meets this momentous challenge, based on Alu exonization events. Alus are transposable elements, of which approximately one million copies exist in the human genome, a large portion of which reside within introns. Throughout evolution, some intronic Alus accumulated mutations and became recognized by the splicing machinery as exons, a process termed exonization. Such Alus remain highly similar to their non-exonizing counterparts but are perceived as different by the splicing machinery. By comparing exonizing Alus to their non-exonizing counterparts, we were able to identify numerous features in which they differ and which presumably lead to the recognition only of the former by the splicing machinery. Our findings reveal insights regarding the role of local RNA secondary structures, exon–intron architecture constraints, and splicing regulatory signals. We integrated these features in a computational model, which was able to successfully mimic the function of the splicing machinery and discriminate between true Alu exons and their intronic counterparts, highlighting the functional importance of these features.
Transposable elements may acquire unrelated gene fragments into their sequences in a process called transduplication. Transduplication of protein-coding genes is common in plants, but is unknown in animals. Here, we report that the Turmoil-1 transposable element in C. elegans has incorporated two protein-coding sequences into its inverted terminal repeat (ITR) sequences. The ITRs of Turmoil-1 contain a conserved RNA recognition motif (RRM) that originated from the rsp-2 gene and a fragment from the protein-coding region of the cpg-3 gene. We further report that an open reading frame specific to C. elegans may have been created as a result of a Turmoil-1 insertion. Mutations at the 5' splice site of this open reading frame may have reactivated the transduplicated RRM motif.
This article was reviewed by Dan Graur and William Martin. For the full reviews, please go to the Reviewers' Reports section.
Examination of the human transcriptome reveals higher levels of RNA editing than in any other organism tested to date. This is indicative of extensive double-stranded RNA (dsRNA) formation within the human transcriptome. Most of the editing sites are located in the primate-specific retrotransposed element called Alu. A large fraction of Alus are found in intronic sequences, implying extensive Alu-Alu dsRNA formation in mRNA precursors. Yet, the effect of these intronic Alus on splicing of the flanking exons is largely unknown. Here, we show that more Alus flank alternatively spliced exons than constitutively spliced ones; this is especially notable for those exons that have changed their mode of splicing from constitutive to alternative during human evolution. This implies that Alu insertions may change the mode of splicing of the flanking exons. Indeed, we demonstrate experimentally that two Alu elements that were inserted into an intron in opposite orientation undergo base-pairing, as evidenced by RNA editing, and affect the splicing patterns of a downstream exon, shifting it from constitutive to alternative. Our results indicate the importance of intronic Alus in influencing the splicing of flanking exons, further emphasizing the role of Alus in shaping the human transcriptome.
The human genome is crowded with over one million copies of primate-specific retrotransposed elements, termed Alu. A large fraction of Alu elements are located within intronic sequences. The human transcriptome undergoes extensive RNA editing (A-to-I), to higher levels than any other tested organism. RNA editing requires the formation of a double-stranded RNA structure in order to occur. Over 90% of the editing sites in the human transcriptome are found within Alu sequences. Thus, the high level of RNA editing is indicative of extensive secondary structure formation in mRNA precursors driven by intronic Alu-Alu base pairing. Splicing is a molecular mechanism in which introns are removed from an mRNA precursor and exons are ligated to form a mature mRNA. Here, we show that Alu insertions into introns can affect the splicing of the flanking exons. We experimentally demonstrate that two Alu elements that were inserted into the same intron in opposite orientation undergo base-pairing, and consequently shift the splicing pattern of the downstream exon from constitutive inclusion in all mature mRNA molecules to alternative skipping. This emphasizes the impact of Alu elements on the primate-specific transcriptome evolution, as such events can generate new isoforms that might acquire novel functions.
Exonization of Alu elements creates primate-specific genomic diversity. Here we combine bioinformatic and experimental methodologies to reconstruct the molecular changes leading to exon selection. Our analyses revealed an intricate network involved in Alu exonization. A typical Alu element contains multiple sites with the potential to serve as 5′ splice sites (5′ss). First, we demonstrated the role of 5′ss strength in controlling exonization events. Second, we found that a cryptic 5′ss enhances the selection of a more upstream site and demonstrate that this is mediated by binding of U1 snRNA to the cryptic splice site, challenging the traditional role attributed to U1 snRNA of binding the 5′ss only. Third, we used a simple algorithm to identify specific sequences that determine splice site selection within specific Alu exons. Finally, by inserting identical exons within different sequences, we demonstrated the importance of flanking genomic sequences in determining whether an Alu exon will undergo exonization. Overall, our results demonstrate the complex interplay between at least four interacting layers that affect Alu exonization. These results shed light on the mechanism through which Alu elements enrich the primate transcriptome and allow a better understanding of the exonization process in general.
Alus, primate-specific retroelements, are the most abundant repetitive elements in the human genome. They are composed of two related but distinct monomers, left and right arms. Intronic Alu elements may acquire mutations that generate functional splice sites, a process called exonization. Most exonizations occur in right arms of antisense Alu elements, and are alternatively spliced. Here we show that without the left arm, exonization of the right arm shifts from alternative to constitutive splicing. This eliminates the evolutionarily conserved isoform and may thus be selected against. We further show that insertion of the left arm downstream of a constitutively spliced non-Alu exon shifts splicing from constitutive to alternative. Although the two arms are highly similar, the left arm is characterized by weaker splicing signals and lower exonic splicing regulatory (ESR) densities. Mutations that improve these potential splice signals activate exonization and shift splicing from the right to the left arm. Collaboration between two or more putative splice signals provides the intronic left arm with a pseudo-exon function. Thus, the dimeric form of the Alu element fortuitously provides it with an evolutionary advantage, allowing enrichment of the primate transcriptome without compromising its original repertoire.
Gene duplication and exonization of intronic transposed elements are two mechanisms that enhance genomic diversity. We examined whether there is less selection against exonization of transposed elements in duplicated genes than in single-copy genes.
Genome-wide analysis of exonization of transposed elements revealed a higher rate of exonization within duplicated genes relative to single-copy genes. The gene for TIF-IA, an RNA polymerase I transcription initiation factor, underwent a humanoid-specific triplication; all three copies of the gene are transcriptionally active, although only one copy retains the ability to generate the TIF-IA protein. Prior to TIF-IA triplication, an Alu element was inserted into the first intron. In one of the non-protein coding copies, this Alu is exonized. We identified a single point mutation leading to exonization in one of the gene duplicates. When this mutation was introduced into the TIF-IA coding copy, exonization was activated and the level of the protein-coding mRNA was reduced substantially. A very low level of exonization was detected in normal human cells. However, this exonization was abundant in most leukemia cell lines evaluated, although the genomic sequence is unchanged in these cancerous cells compared to normal cells.
The definition of the Alu element within the TIF-IA gene as an exon is restricted to certain types of cancers; the element is not exonized in normal human cells. These results further our understanding of the delicate interplay between gene duplication and alternative splicing and of the molecular evolutionary mechanisms leading to genetic innovations. This implies the existence of purifying selection against exonization in single-copy genes, with duplicate genes free from such constraints.
Alternative cassette exons are known to originate from two processes—exonization of intronic sequences and exon shuffling. Herein, we suggest an additional mechanism by which constitutively spliced exons become alternative cassette exons during evolution. We compiled a dataset of orthologous exons from human and mouse that are constitutively spliced in one species but alternatively spliced in the other. Examination of these exons suggests that the common ancestors were constitutively spliced. We show that relaxation of the 5′ splice site during evolution is one of the molecular mechanisms by which exons shift from constitutive to alternative splicing. This shift is associated with the fixation of exonic splicing regulatory sequences (ESRs) that are essential for exon definition and control the inclusion level only after the transition to alternative splicing. The effect of each ESR on splicing and the combinatorial effects between two ESRs are conserved from fish to human. Our results uncover an evolutionary pathway that increases transcriptome diversity by shifting exons from constitutive to alternative splicing.
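One crude way to make "relaxation of the 5′ splice site" concrete is to count how many of the nine positions spanning the exon-intron junction (the last three exonic and first six intronic nucleotides) match the sequence CAG/GTAAGT, which is fully complementary to the 5′ end of U1 snRNA. The Python sketch below is only a rough proxy and is not one of the scoring algorithms used in the studies described here; the function name `u1_match_count` and the example sites are hypothetical.

```python
# Nine-nucleotide 5' splice site that pairs perfectly with U1 snRNA:
# exon positions -3..-1 followed by intron positions +1..+6 (DNA alphabet).
U1_COMPLEMENTARY_SITE = "CAGGTAAGT"


def u1_match_count(site: str) -> int:
    """Count positions of a 9-nt 5' splice site that match the
    U1-complementary sequence. Accepts DNA or RNA letters."""
    site = site.upper().replace("U", "T")
    if len(site) != len(U1_COMPLEMENTARY_SITE):
        raise ValueError("expected exactly 9 nucleotides (-3 to +6)")
    if set(site) - set("ACGT"):
        raise ValueError("site contains non-nucleotide characters")
    return sum(a == b for a, b in zip(site, U1_COMPLEMENTARY_SITE))


# Invented examples: a perfect site and a weakened ("relaxed") variant.
for example in ("CAGGTAAGT", "AAGGTGAGT"):
    print(example, u1_match_count(example))
```

A position-independent match count ignores the fact that some positions contribute more to recognition than others, so it should be read as an illustration of the idea rather than as a quantitative strength measure.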
Alternative splicing is believed to play a major role in the creation of transcriptomic diversification leading to higher order of organismal complexity, especially in mammals. As much as 80% of human genes generate more than one type of mRNA by alternative splicing. Thus, alternative splicing can bridge the low number of protein coding genes (∼24,500) and the total number of proteins generated in the human proteome (∼90,000). The correlation between the higher order of phenotypic diversity and alternative splicing was recently demonstrated and thus the origin of alternative splicing is of great interest. There are currently two models regarding the origin of alternatively spliced exons—exonization of intronic sequences and exon shuffling. According to these two mechanisms, a protein-coding gene was first established and only then a new alternative exon appeared within it or was added to the gene. Our current study provides evidence for a new mechanism indicating that during evolution constitutively spliced exons became alternatively spliced. Large-scale bioinformatic analyses reveal the magnitude of this process and experimental validation systems provide insights into its mechanisms.
Transposed elements (TEs) are known to affect transcriptomes, because either new exons are generated from intronic transposed elements (this is called exonization), or the element inserts into the exon, leading to a new transcript. Several examples in the literature show that isoforms generated by an exonization are specific to a certain tissue (for example the heart muscle) or cause disease. Thus, exonizations can have negative effects for the transcriptome of an organism.
As we aimed at detecting other tissue- or tumor-specific isoforms in human and mouse genomes which were generated through exonization of a transposed element, we designed the automated analysis pipeline SERpredict (SER = Specific Exonized Retroelement) making use of Bayesian Statistics. With this pipeline, we found several genes in which a transposed element formed a tissue- or tumor-specific isoform.
Our results show that SERpredict produces relevant results, demonstrating the importance of transposed elements in shaping both the human and the mouse transcriptomes. The effect of transposed elements on the human transcriptome is several times higher than the effect on the mouse transcriptome, due to the contribution of the primate-specific Alu elements.
Transposed elements (TEs) are mobile genetic sequences. During the evolution of eukaryotes TEs were inserted into active protein-coding genes, affecting gene structure, expression and splicing patterns, and protein sequences. Genomic insertions of TEs also led to creation and expression of new functional non-coding RNAs such as microRNAs. We have constructed the TranspoGene database, which covers TEs located inside protein-coding genes of seven species: human, mouse, chicken, zebrafish, fruit fly, nematode and sea squirt. TEs were classified according to location within the gene: proximal promoter TEs, exonized TEs (insertion within an intron that led to exon creation), exonic TEs (insertion into an existing exon) or intronic TEs. TranspoGene contains information regarding specific type and family of the TEs, genomic and mRNA location, sequence, supporting transcript accession and alignment to the TE consensus sequence. The database also contains host gene specific data: gene name, genomic location, Swiss-Prot and RefSeq accessions, diseases associated with the gene and splicing pattern. In addition, we created microTranspoGene: a database of human, mouse, zebrafish and nematode TE-derived microRNAs. The TranspoGene and microTranspoGene databases can be used by researchers interested in the effect of TE insertion on the eukaryotic transcriptome. Publicly available query interfaces to TranspoGene and microTranspoGene are available at http://transpogene.tau.ac.il/ and http://microtranspogene.tau.ac.il, respectively. The entire database can be downloaded as flat files.
Analysis of transposed elements in the human and mouse genomes reveals many effects on the transcriptomes, including a higher level of exonization of Alu elements than other elements.
Transposed elements (TEs) have a substantial impact on mammalian evolution and are involved in numerous genetic diseases. We compared the impact of TEs on the human transcriptome and the mouse transcriptome.
We compiled a dataset of all TEs in the human and mouse genomes, identifying 3,932,058 and 3,122,416 TEs, respectively. We then extracted TEs located within human and mouse genes and, surprisingly, we found that 60% of TEs in both human and mouse are located in intronic sequences, even though introns comprise only 24% of the human genome. All TE families in both human and mouse can exonize. TE families that are shared between human and mouse exhibit the same percentage of TE exonization in the two species, but the exonization level of Alu, a primate-specific retroelement, is significantly greater than that of other TEs within the human genome, leading to a higher level of TE exonization in human than in mouse (1,824 exons compared with 506 exons, respectively). We detected a primate-specific mechanism for intron gain, in which Alu insertion into an exon creates a new intron located in the 3' untranslated region (termed 'intronization'). Finally, the insertion of TEs into the first and last exons of a gene is more frequent in human than in mouse, leading to longer exons in human.
Our findings reveal many effects of TEs on these two transcriptomes. These effects are substantially greater in human than in mouse, which is due to the presence of Alu elements in human.
Alternative 3′ and 5′ splice site (ss) events constitute a significant part of all alternative splicing events. These events were also found to be related to several aberrant splicing diseases. However, only a few of the characteristics that distinguish these events from alternative cassette exons are currently known. In this study, we compared the characteristics of constitutive exons, alternative cassette exons, and alternative 3′ss and 5′ss exons. The results revealed that alternative 3′ss and 5′ss exons are an intermediate state between constitutive and alternative cassette exons, where the constitutive side resembles constitutive exons, and the alternative side resembles alternative cassette exons. The results also show that alternative 3′ss and 5′ss exons exhibit low levels of symmetry (frame-preserving), similar to constitutive exons, whereas the sequence between the two alternative splice sites shows high symmetry levels, similar to alternative cassette exons. In addition, flanking intronic conservation analysis revealed that exons whose alternative splice sites are at least nine nucleotides apart show a high conservation level, indicating intronic participation in the regulation of their splicing, whereas exons whose alternative splice sites are fewer than nine nucleotides apart show a low conservation level. Further examination of these exons, spanning seven vertebrate species, suggests an evolutionary model in which the alternative state is a derivative of an ancestral constitutive exon, where a mutation inside the exon or along the flanking intron resulted in the creation of a new splice site that competes with the original one, leading to alternative splice site selection. This model was validated experimentally on four exons, showing that they indeed originated from constitutive exons that acquired a new competing splice site during evolution.
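The two criteria used above, symmetry (a length that is a multiple of three and therefore preserves the reading frame) and the nine-nucleotide distance between the alternative splice sites, can be sketched in a few lines. This is a minimal illustration, not the analysis pipeline of the study: the class `AlternativeSpliceSiteExon`, its field names and the example lengths are hypothetical.

```python
from dataclasses import dataclass

# Threshold taken from the text: sites "at least nine nucleotides apart".
MIN_DISTANCE_NT = 9


def is_symmetric(length_nt):
    """Return True when a length is a multiple of three (frame-preserving)."""
    if length_nt < 0:
        raise ValueError("length must be non-negative")
    return length_nt % 3 == 0


@dataclass(frozen=True)
class AlternativeSpliceSiteExon:
    """Hypothetical record for an alternative 3'ss or 5'ss exon."""
    exon_length: int        # length of the exon isoform under analysis, in nt
    alt_region_length: int  # nt between the two alternative splice sites

    def exon_is_symmetric(self):
        return is_symmetric(self.exon_length)

    def alt_region_is_symmetric(self):
        return is_symmetric(self.alt_region_length)

    def sites_far_apart(self):
        # Sites at least nine nucleotides apart fall in the group that the
        # text reports as having highly conserved flanking introns.
        return self.alt_region_length >= MIN_DISTANCE_NT


# Made-up example values, chosen only to exercise each branch.
far = AlternativeSpliceSiteExon(exon_length=121, alt_region_length=12)
near = AlternativeSpliceSiteExon(exon_length=120, alt_region_length=4)

for exon in (far, near):
    print(exon, exon.exon_is_symmetric(),
          exon.alt_region_is_symmetric(), exon.sites_far_apart())
```

In the first example the exon itself is not symmetric while the 12-nucleotide alternative region is, which is the pattern the text describes as typical for these exons.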
Alternative splicing is the mechanism that is responsible for the creation of multiple mRNA products from a single gene. It is considered a key player in achieving genomic complexity. Alternative 3′ and 5′ splicing events, in which part of the exon is alternatively included or excluded in the mRNA, constitute a significant part of all alternative splicing events, and yet little is known regarding their regulation mechanism and the evolutionary background that led to their creation. We show that alternative 3′ and 5′ splice site exons resemble constitutive exons. However, their alternative sequence resembles alternative cassette exons. Comparative genomics spanning seven vertebrate species suggests an evolutionary model in which the alternative state is a derivative of an ancestral constitutive exon, where a mutation inside the exon or along the flanking intron resulted in the creation of a new splice site that competes with the original one, leading to alternative splice site selection. This model was validated experimentally, showing that during evolution mutations shifted constitutive exons to undergo alternative 3′ and 5′ splicing.
A primate-specific exon is found to be dependent on RNA editing for its exonization.
Alu retroelements are specific to primates and abundant in the human genome. Through mutations that create functional splice sites within intronic Alus, these elements can become new exons in a process denoted exonization. It was recently shown that Alu elements are also heavily changed by RNA editing in the human genome.
Here we show that the human nuclear prelamin A recognition factor contains a primate-specific Alu-exon that exclusively depends on RNA editing for its exonization. We demonstrate that RNA editing regulates the exonization in a tissue-dependent manner, through both the creation of a functional AG 3' splice site, and alteration of functional exonic splicing enhancers within the exon. Furthermore, a premature stop codon within the Alu-exon is eliminated by an exceptionally efficient RNA editing event. The sequence surrounding this editing site is important not only for editing of that site but also for editing of neighboring sites.
Our results show that the abundant RNA editing of Alu sequences can be recruited as a mechanism supporting the birth of new exons in the human genome.
Alternative splicing increases transcriptome and proteome diversification. Previous analyses that compared the rate of alternative splicing between different organisms produced contradictory results. These contradictions were attributed to the fact that both analyses depended on expressed sequence tag (EST) coverage, which varies greatly between the tested organisms. In this study we compare the level of alternative splicing among eight different organisms. By employing an EST-independent approach we reveal that the percentage of genes and exons undergoing alternative splicing is higher in vertebrates than in invertebrates. We also find that alternative exons of the skipping type are flanked by longer introns than constitutive ones, whereas alternative 5′ and 3′ splice site events are generally not. In addition, although the regulation of alternative splicing and sizes of introns and exons have changed during metazoan evolution, intron retention remained the rarest type of alternative splicing, whereas exon skipping is more prevalent and exhibits a slight increase from invertebrates to vertebrates. The difference in the level of alternative splicing suggests that alternative splicing may contribute greatly to the higher level of phenotypic complexity in mammals, and that accumulation of introns confers an evolutionary advantage as it allows an increase in the number of alternative splicing forms.