Related Articles

1.  Alternative Splicing of RNA Triplets Is Often Regulated and Accelerates Proteome Evolution 
PLoS Biology  2012;10(1):e1001229.
Inclusion or exclusion of single codons at the splice acceptor site of mammalian genes is regulated in a tissue-specific manner, is strongly conserved, and is associated with local accelerated protein evolution.
Thousands of human genes contain introns ending in NAGNAG (N any nucleotide), where both NAGs can function as 3′ splice sites, yielding isoforms that differ by inclusion/exclusion of three bases. However, few models exist for how such splicing might be regulated, and some studies have concluded that NAGNAG splicing is purely stochastic and nonfunctional. Here, we used deep RNA-Seq data from 16 human and eight mouse tissues to analyze the regulation and evolution of NAGNAG splicing. Using both biological and technical replicates to estimate false discovery rates, we estimate that at least 25% of alternatively spliced NAGNAGs undergo tissue-specific regulation in mammals, and alternative splicing of strongly tissue-specific NAGNAGs was 10 times as likely to be conserved between species as was splicing of non-tissue-specific events, implying selective maintenance. Preferential use of the distal NAG was associated with distinct sequence features, including a more distal location of the branch point and presence of a pyrimidine immediately before the first NAG, and alteration of these features in a splicing reporter shifted splicing away from the distal site. Strikingly, alignments of orthologous exons revealed a ∼15-fold increase in the frequency of three base pair gaps at 3′ splice sites relative to nearby exon positions in both mammals and in Drosophila. Alternative splicing of NAGNAGs in human was associated with dramatically increased frequency of exon length changes at orthologous exon boundaries in rodents, and a model involving point mutations that create, destroy, or alter NAGNAGs can explain both the increased frequency and biased codon composition of gained/lost sequence observed at the beginnings of exons. 
This study shows that NAGNAG alternative splicing generates widespread differences between the proteomes of mammalian tissues, and suggests that the evolutionary trajectories of mammalian proteins are strongly biased by the locations and phases of the introns that interrupt coding sequences.
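The NAGNAG definition in this abstract (an intron ending in two consecutive NAG triplets, either of which can serve as the 3′ splice site) can be illustrated with a short sketch. This is only a minimal illustration, not the authors' pipeline: the function names and the convention that the intron is annotated up to the distal NAG are assumptions made here.

```python
import re

# Tandem acceptor motif: two consecutive NAG triplets (N = any nucleotide).
NAGNAG = re.compile(r"[ACGT]AG[ACGT]AG")


def is_nagnag(intron: str) -> bool:
    """Return True if the last six bases of the intron form a NAGNAG motif."""
    tail = intron.upper().replace("U", "T")[-6:]
    return NAGNAG.fullmatch(tail) is not None


def acceptor_isoforms(intron: str, exon: str) -> dict:
    """Return the downstream exon sequence for each acceptor choice.

    Assumption (hypothetical convention): the intron is annotated up to the
    distal NAG, so using the proximal NAG leaves the final three intron
    bases in the mRNA.
    """
    intron = intron.upper().replace("U", "T")
    exon = exon.upper().replace("U", "T")
    if not is_nagnag(intron):
        return {"distal": exon}
    return {"proximal": intron[-3:] + exon, "distal": exon}


isoforms = acceptor_isoforms("GTAAGTCTCTTTCAGCAG", "GTTGAA")
# The two isoforms differ by exactly one RNA triplet.
length_difference = len(isoforms["proximal"]) - len(isoforms["distal"])
```

With the example intron above, the proximal isoform carries one extra triplet (CAG) at the start of the exon, which is the single-codon difference the abstract describes.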
Author Summary
In order to translate a gene into protein, all of the non-coding regions (introns) need to be removed from the transcript and the coding regions (exons) stitched back together to make an mRNA. Most human genes are alternatively spliced, allowing the selection of different combinations of exons to produce multiple distinct mRNAs and proteins. Many types of alternative splicing are known to play crucial roles in biological processes including cell fate determination, tumor metabolism, and apoptosis. In this study, we investigated a form of alternative splicing in which competing adjacent 3′ splice sites (or splice acceptor sites) generate mRNAs differing by just an RNA triplet, the size of a single codon. This mode of alternative splicing, known as NAGNAG splicing, affects thousands of human genes and has been known for a decade, but its potential regulation, physiological importance, and conservation across species have been disputed. Using high-throughput sequencing of cDNA (“RNA-Seq”) from human and mouse tissues, we found that single-codon splicing often shows strong tissue specificity. Regulated NAGNAG alternative splice sites are selectively conserved between human and mouse genes, suggesting that they are important for organismal fitness. We identified features of the competing splice sites that influence NAGNAG splicing, and validated their effects in cultured cells. Furthermore, we found that this mode of splicing is associated with accelerated and highly biased protein evolution at exon boundaries. Taken together, our analyses demonstrate that the inclusion or exclusion of RNA triplets at exon boundaries can be effectively regulated by the splicing machinery, and highlight an unexpected connection between RNA processing and protein evolution.
PMCID: PMC3250501  PMID: 22235189
2.  A phylogenetic study of Drosophila splicing assembly chaperone RNP-4F associated U4-/U6-snRNA secondary structure 
The rnp-4f gene in Drosophila melanogaster encodes nuclear protein RNP-4F. This encoded protein is represented by homologs in other eukaryotic species, where it has been shown to function as an intron splicing assembly factor. Here, RNP-4F is believed to initially bind to a recognition sequence on U6-snRNA, serving as a chaperone to facilitate its association with U4-snRNA by intermolecular hydrogen bonding. RNA conformations are a key factor in spliceosome function, so that elucidation of changing secondary structures for interacting snRNAs is a subject of considerable interest and importance. Among the five snRNAs which participate in removal of spliceosomal introns, there is a growing consensus that U6-snRNA is the most structurally dynamic and may constitute the catalytic core. Previous studies by others have generated potential secondary structures for free U4- and U6-snRNAs, including the Y-shaped U4-/U6-snRNA model. These models were based on study of RNAs from relatively few species, and the popular Y-shaped model remains to be systematically re-examined with reference to the many new sequences generated by recent genomic sequencing projects. We have utilized a comparative phylogenetic approach on 60 diverse eukaryotic species, which resulted in a revised and improved U4-/U6-snRNA secondary structure. This general model is supported by observation of abundant compensatory base mutations in every stem, and incorporates more of the nucleotides into base-paired associations than in previous models, thus being more energetically stable. We have extensively sampled the eukaryotic phylogenetic tree to its deepest roots, but did not find genes potentially encoding either U4- or U6-snRNA in the Giardia and Trichomonas databases. Our results support the hypothesis that nuclear introns in these most deeply rooted eukaryotes may represent evolutionary intermediates, sharing characteristics of both group II and spliceosomal introns. 
An unexpected result of this study was the discovery of a potential competitive binding site for Drosophila splicing assembly factor RNP-4F in a 5′-UTR regulatory region within its own pre-mRNA, which may play a role in negative feedback control.
PMCID: PMC4237228  PMID: 25419488
RNP-4F; snRNA Secondary Structure; U4-/U6-snRNA Phylogeny; Spliceosome Evolution
3.  The U1, U2 and U5 snRNAs crosslink to the 5′ exon during yeast pre-mRNA splicing 
Nucleic Acids Research  2007;36(3):814-825.
Activation of pre-messenger RNA (pre-mRNA) splicing requires 5′ splice site recognition by U1 small nuclear RNA (snRNA), which is replaced by U5 and U6 snRNA. Here we use crosslinking to investigate snRNA interactions with the 5′ exon adjacent to the 5′ splice site, prior to the first step of splicing. U1 snRNA was found to interact with four different 5′ exon positions using one specific sequence adjacent to U1 snRNA helix 1. We propose that this novel interaction of U1 occurs before U1-5′ splice site base pairing. In contrast, U5 snRNA interactions with the 5′ exon of the pre-mRNA progressively shift towards the 5′ end of U5 loop 1 as the crosslinking group is placed further from the 5′ splice site, with only interactions closest to the 5′ splice site persisting to the 5′ exon intermediate and the second step of splicing. A novel yeast U2 snRNA interaction with the 5′ exon was also identified, which is ATP dependent and requires U2-branchpoint interaction. This study provides insight into the nature and timing of snRNA interactions required for 5′ splice site recognition prior to the first step of pre-mRNA splicing.
PMCID: PMC2241886  PMID: 18084028
4.  Competing Upstream 5′ Splice Sites Enhance the Rate of Proximal Splicing 
Molecular and Cellular Biology  2010;30(8):1878-1886.
Alternative 5′ splice site selection is one of the major pathways resulting in mRNA diversification. Regulation of this type of alternative splicing depends on the presence of regulatory elements that activate or repress the use of competing splice sites, usually leading to the preferential use of the proximal splice site. However, the mechanisms involved in proximal splice site selection and the thermodynamic advantage realized by proximal splice sites are not well understood. Here, we have carried out a systematic analysis of alternative 5′ splice site usage using in vitro splicing assays. We show that observed rates of splicing correlate well with their U1 snRNA base pairing potential. Weak U1 snRNA interactions with the 5′ splice site were significantly rescued by the proximity of the downstream exon, demonstrating that the intron definition mode of splice site recognition is highly efficient. In the context of competing splice sites, the proximity to the downstream 3′ splice site was more influential in dictating splice site selection than the actual 5′ splice site/U1 snRNA base pairing potential. Surprisingly, the kinetic analysis also demonstrated that an upstream competing 5′ splice site enhances the rate of proximal splicing. These results reveal a new splicing regulatory element: an upstream 5′ splice site functioning as a splicing enhancer.
PMCID: PMC2849477  PMID: 20123971
5.  Human GC-AG alternative intron isoforms with weak donor sites show enhanced consensus at acceptor exon positions 
Nucleic Acids Research  2001;29(12):2581-2593.
It has been previously observed that the intrinsically weak variant GC donor sites, in order to be recognized by the U2-type spliceosome, possess strong consensus sequences maximized for base pair formation with U1 and U5/U6 snRNAs. However, variability in signal strength is a fundamental mechanism for splice site selection in alternative splicing. Here we report human alternative GC-AG introns (for the first time from any species), and show that while constitutive GC-AG introns do possess strong signals at their donor sites, a large subset of alternative GC-AG introns possess weak consensus sequences at their donor sites. Surprisingly, this subset of alternative isoforms shows strong consensus at acceptor exon positions 1 and 2. The improved consensus at the acceptor exon can facilitate a strong interaction with U5 snRNA, which tethers the two exons for ligation during the second step of splicing. Further, these isoforms nearly always possess alternative acceptor sites and exhibit particularly weak polypyrimidine tracts characteristic of AG-dependent introns. The acceptor exon nucleotides are part of the consensus required for the U2AF35-mediated recognition of AG in such introns. Such improved consensus at acceptor exons is not found in either normal or alternative GT-AG introns having weak donor sites or weak polypyrimidine tracts. The changes probably reflect mechanisms that allow GC-AG alternative intron isoforms to cope with two conflicting requirements, namely an apparent need for differential splice strength to direct the choice of alternative sites and a need for improved donor signals to compensate for the central mismatch base pair (C-A) in the RNA duplex of U1 snRNA and the pre-mRNA. The other important findings include (i) one in every twenty alternative introns is a GC-AG intron, and (ii) three of every five observed GC-AG introns are alternative isoforms.
PMCID: PMC55748  PMID: 11410667
6.  Uncoupling two functions of the U1 small nuclear ribonucleoprotein particle during in vitro splicing. 
Molecular and Cellular Biology  1993;13(6):3135-3145.
To probe functions of the U1 small nuclear ribonucleoprotein particle (snRNP) during in vitro splicing, we have used unusual splicing substrates which replace the 5' splice site region of an adenovirus substrate with spliced leader (SL) RNA sequences from Leptomonas collosoma or Caenorhabditis elegans. In agreement with previous results (J.P. Bruzik and J.A. Steitz, Cell 62:889-899, 1990), we find that oligonucleotide-targeted RNase H destruction of the 5' end of U1 snRNA inhibits the splicing of a standard adenovirus splicing substrate but not of the SL RNA-containing substrates. However, use of an antisense 2'-O-methyl oligoribonucleotide that disrupts the first stem of U1 snRNA as well as stably sequestering positions of U1 snRNA involved in 5' and 3' splice site recognition inhibits the splicing of both the SL constructs and the standard adenovirus substrate. The 2'-O-methyl oligoribonucleotide is no more effective than RNase H pretreatment in preventing pairing of U1 with the 5' splice site, as assessed by inhibition of psoralen cross-link formation between the SL RNA-containing substrate and U1. The 2'-O-methyl oligoribonucleotide does not alter the protein composition of the U1 monoparticle or deplete the system of essential splicing factors. Native gel analysis indicates that the 2'-O-methyl oligoribonucleotide inhibits splicing by diminishing the formation of splicing complexes. One interpretation of these results is that removal of the 5' end of U1 inhibits base pairing in a different way than sequestering the same sequence with a complementary oligoribonucleotide. Alternatively, our data may indicate that two elements near the 5' end of U1 RNA normally act during spliceosome assembly; the extreme 5' end base pairs with the 5' splice site, while the sequence or structural integrity of stem I is essential for some additional function. It follows that different introns may differ in their use of the repertoire of U1 snRNP functions.
PMCID: PMC359749  PMID: 7684489
7.  Evolutionarily divergent spliceosomal snRNAs and a conserved non-coding RNA processing motif in Giardia lamblia 
Nucleic Acids Research  2012;40(21):10995-11008.
Non-coding RNAs (ncRNAs) have diverse essential biological functions in all organisms, and in eukaryotes, two such classes of ncRNAs are the small nucleolar (sno) and small nuclear (sn) RNAs. In this study, we have identified and characterized a collection of sno and snRNAs in Giardia lamblia, by exploiting our discovery of a conserved 12 nt RNA processing sequence motif found in the 3′ end regions of a large number of G. lamblia ncRNA genes. RNA end mapping and other experiments indicate the motif serves to mediate ncRNA 3′ end formation from mono- and di-cistronic RNA precursor transcripts. Remarkably, we find the motif is also utilized in the processing pathway of all four previously identified trans-spliced G. lamblia introns, revealing a common RNA processing pathway for ncRNAs and trans-spliced introns in this organism. Motif sequence conservation then allowed for the bioinformatic and experimental identification of additional G. lamblia ncRNAs, including new U1 and U6 spliceosomal snRNA candidates. The U6 snRNA candidate was then used as a tool to identify novel U2 and U4 snRNAs, based on predicted phylogenetically conserved snRNA–snRNA base-pairing interactions, from a set of previously identified G. lamblia ncRNAs without assigned function. The Giardia snRNAs retain the core features of spliceosomal snRNAs but are sufficiently evolutionarily divergent to explain the difficulties in their identification. Most intriguingly, all of these snRNAs show structural features diagnostic of U2-dependent/major and U12-dependent/minor spliceosomal snRNAs.
PMCID: PMC3510501  PMID: 23019220
8.  Functionally important structural elements of U12 snRNA 
Nucleic Acids Research  2011;39(19):8531-8543.
U12 snRNA is analogous to U2 snRNA of the U2-dependent spliceosome and is essential for the splicing of U12-dependent introns in metazoan cells. The essential region of U12 snRNA, which base pairs to the branch site of minor class introns, is well characterized. However, other regions outside the branch site base pairing region are not yet characterized, and the requirement of these structures in U12-dependent splicing is not clear. U12 snRNA is predicted to form an intricate secondary structure containing several stem–loops and single-stranded regions. Using a previously characterized branch site genetic suppression assay, we generated second-site mutations in the suppressor U12 snRNA to investigate the in vivo requirement of structural elements in U12-dependent splicing. Our results show that stem–loop IIa is essential and required for in vivo splicing. Interestingly, an evolutionarily conserved stem–loop IIb is dispensable for splicing. We also show that stem–loop III, which binds to a p65 RNA binding protein of the U11-U12 di-snRNP complex, is essential for in vivo splicing. The data validate the existence of proposed stem–loops of U12 snRNA and provide experimental support for individual secondary structures.
PMCID: PMC3201867  PMID: 21737423
9.  A novel approach to describe a U1 snRNA binding site 
Nucleic Acids Research  2003;31(23):6963-6975.
RNA duplex formation between U1 snRNA and a splice donor (SD) site can protect pre-mRNA from degradation prior to splicing and initiates formation of the spliceosome. This process was monitored, using sub-genomic HIV-1 expression vectors, by expression analysis of the glycoprotein env, whose formation critically depends on functional SD4. We systematically derived a hydrogen bond model for the complementarity between the free 5′ end of U1 snRNA and 5′ splice sites and numerous mutations following transient transfection of HeLa-T4+ cells with 5′ splice site mutated vectors. The resulting model takes into account number, interdependence and neighborhood relationships of predicted hydrogen bond formation in a region spanning the three most 3′ base pairs of the exon (–3 to –1) and the eight most 5′ base pairs of the intron (+1 to +8). The model is represented by an algorithm classifying U1 snRNA binding sites which can or cannot functionally substitute SD4 with respect to Rev-mediated env expression. In a data set of 5′ splice site mutations of the human ATM gene we found a significant correlation between the algorithmic classification and exon skipping (P = 0.018, χ2-test), showing that the applicability of the proposed model reaches far beyond HIV-1 splicing. However, the algorithmic classification must not be taken as an absolute measure of SD usage as it may be modified by upstream sequence elements. Upstream to SD4 we identified a fragment supporting ASF/SF2 binding. Mutating GAR nucleotide repeats within this site decreased the SD4-dependent Rev-mediated env expression, which could be balanced simply by artificially increasing the complementarity of SD4.
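The idea of scoring complementarity between the free 5′ end of U1 snRNA and an 11-nucleotide 5′ splice site (positions –3 to +8) can be sketched with a naive hydrogen bond count. This is not the algorithm from the paper, which also weighs interdependence and neighborhood relationships between bonds. The U1 5′ end sequence used here (with modified bases ignored), the bond counts per pair, and the function name are assumptions made for illustration.

```python
# Assumption: free 5' end of human U1 snRNA written 5'->3', with
# modified bases (e.g. pseudouridines) treated as plain U.
U1_5P_END = "AUACUUACCUG"

# Assumed hydrogen bonds per base pair: G-C 3, A-U 2, G-U wobble 2.
H_BONDS = {
    ("G", "C"): 3, ("C", "G"): 3,
    ("A", "U"): 2, ("U", "A"): 2,
    ("G", "U"): 2, ("U", "G"): 2,
}


def u1_hbond_count(site: str) -> int:
    """Naive hydrogen bond count for an 11-nt 5' splice site.

    `site` covers exon positions -3..-1 and intron positions +1..+8,
    written 5'->3'. Pairing is antiparallel, so the site is aligned
    against U1 read 3'->5'. Mismatches contribute zero.
    """
    site = site.upper().replace("T", "U")
    if len(site) != len(U1_5P_END):
        raise ValueError("expected an 11-nt site spanning -3 to +8")
    return sum(H_BONDS.get((s, u), 0) for s, u in zip(site, reversed(U1_5P_END)))


consensus_score = u1_hbond_count("CAGGUAAGUAU")  # fully complementary site
mutant_score = u1_hbond_count("CAGGCAAGUAU")     # +2 U>C breaks one A-U pair
```

A plain sum like this treats every position independently, so it cannot capture the context effects the abstract attributes to the published model or to upstream elements such as the ASF/SF2 site.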
PMCID: PMC290269  PMID: 14627829
10.  Plant intron sequences: evidence for distinct groups of introns. 
Nucleic Acids Research  1988;16(14B):7159-7176.
In vivo and in vitro RNA splicing experiments have demonstrated that the intron splicing machineries are not interchangeable in all organisms. These differences have prevented the efficient in vivo expression of monocot genes containing introns in dicot plants and the in vitro excision of some plant introns in HeLa cell in vitro splicing extracts. We have analyzed plant introns for sequence differences which potentially account for the functional splicing differences. Three classes of plant introns can be differentiated by the purine or pyrimidine-richness of sequences upstream from the 3' splice site. The frequency of these three types of introns in monocots and dicots varies significantly. The degree of variability in the 5' and 3' intron boundaries is evaluated for each of these classes in monocots and dicots. The 5' splice site consensus sequences developed for the monocot and dicot introns differ in their ability to base pair with conserved nucleotides present at the 5' end of many U1 snRNAs.
PMCID: PMC338358  PMID: 3405760
11.  Activity of chimeric RNAs of U6 snRNA and (-)sTRSV in the cleavage of a substrate RNA. 
Nucleic Acids Research  1992;20(12):2991-2996.
U6 small nuclear RNA is one of the spliceosomal RNAs essential for pre-mRNA splicing. Discovery of mRNA-type introns in the highly conserved region of the U6 snRNA genes led to the hypothesis that U6 snRNA functions as a catalytic element during pre-mRNA splicing. The highly conserved region of U6 snRNA has a structural similarity with the catalytic domain of the negative strand of the satellite RNA of tobacco ring spot virus [(-)sTRSV], suggesting that the highly conserved region of U6 snRNA forms the catalytic center. We examined whether synthetic RNAs consisting of the sequence of the highly conserved region of U6 snRNA or various chimeric RNAs between the U6 region and the catalytic RNA of (-)sTRSV could cleave a substrate RNA that can partially base-pair with them and have a GU sequence. Chimeric RNAs with 70 to 83% sequence identity with the conserved region of S. pombe U6 snRNA cleaved the substrate RNA at the 5' side of the GU sequence, which is shared by the 5' end of an intron in a pre-mRNA. We found that the highly conserved region of U6 snRNA and the catalytic domain of (-)sTRSV are strikingly similar in structure to the catalytic core region of the group I self-splicing intron in cyanobacteria. These results suggest that U6 snRNA, (-)sTRSV and the group I self-splicing intron originated from a common ancestral RNA, and support the hypothesis that U6 snRNA catalyzes pre-mRNA splicing reaction.
PMCID: PMC312428  PMID: 1620594
12.  The 5' end domain of U2 snRNA is required to establish the interaction of U2 snRNP with U2 auxiliary factor(s) during mammalian spliceosome assembly. 
Nucleic Acids Research  1991;19(4):877-884.
Stable association of U2 snRNP with the branchpoint sequence of mammalian pre-mRNAs requires binding of a non-snRNP protein to the polypyrimidine tract. In order to determine how U2 snRNP contacts this protein, we have used an RNA containing the consensus 5' and the (Py)n-AG 3' splice sites but lacking the branchpoint sequence so as to prevent direct U2 snRNA base pairing to the branchpoint. Different approaches including electrophoretic separation of RNP complexes formed in nuclear extracts, RNase T1 protection immunoprecipitation assays with antibodies against snRNPs and UV cross-linking experiments coupled to immunoprecipitations allowed us to demonstrate that at least three splicing factors contact this RNA at 0°C without ATP. As expected, U1 snRNP interacts with the region comprising the 5' splice site. A protein of approximately 65,000 molecular weight (p65) recognizes the RNA specifically at the 5' boundary of the polypyrimidine tract. It could be either the U2 auxiliary factor (U2AF) (Zamore and Green (1989) PNAS 86, 9243-9247), the polypyrimidine tract binding protein (pPTB) (Garcia-Blanco et al. (1989) Genes and Dev. 3, 1874-1886) or a mixture of both. U2 snRNP also contacts the RNA in a way depending on p65 binding, thereby further arguing that the latter may correspond to the previously characterized U2AF and pPTB. Cleavage of U2 snRNA sequence by a complementary oligonucleotide and RNase H led us to conclude that the 5' terminus of U2 snRNA is required to ensure the contact between U2 snRNP and p65 bound to the RNA. More importantly, this conclusion can be extended to authentic pre-mRNAs. When we used a human beta-globin pre-mRNA instead of the above artificial substrate, RNA-bound p65 became precipitable by anti-(U2) RNP and anti-Sm antibodies except when the 5' end of U2 snRNA was selectively cleaved.
PMCID: PMC333725  PMID: 1850127
13.  An mRNA-type intron is present in the Rhodotorula hasegawae U2 small nuclear RNA gene. 
Molecular and Cellular Biology  1993;13(9):5613-5619.
Splicing an mRNA precursor requires multiple factors involving five small nuclear RNA (snRNA) species called U1, U2, U4, U5, and U6. The presence of mRNA-type introns in the U6 snRNA genes of some yeasts led to the hypothesis that U6 snRNA may play a catalytic role in pre-mRNA splicing and that the U6 introns occurred through reverse splicing of an intron from an mRNA precursor into a catalytic site of U6 snRNA. We characterized the U2 snRNA gene of the yeast Rhodotorula hasegawae, which has four mRNA-type introns in the U6 snRNA gene, and found an mRNA-type intron of 60 bp. The intron of the U2 snRNA gene is present in the highly conserved region immediately downstream of the branch site recognition domain. Interestingly, we found that this region can form a novel base pairing with U6 snRNA. We discuss the possible implications of these findings for the mechanisms of intron acquisition and for the role of U2 snRNA in pre-mRNA splicing.
PMCID: PMC360287  PMID: 8355704
14.  Sequence complementarity of U2 snRNA and U2A' intron predicts intron function 
Genome Biology  2005;6(4):P6.
This paper exemplifies a putative function of an intron RNA (i5e6i6) of the U2 small nuclear ribonucleoprotein particle (U2 snRNP) A' specific protein (U2A') pre mRNA. A possible RNA-RNA structure formed by complementary sequences in U2A'i5e6i6 and U2 snRNA is conserved in vertebrates, suggesting a role of U2A'i5e6i6 in the 3'end processing of U2 snRNA primary transcript.
The human genome contains about 24% introns and only 1-2% exons. Why such a large amount of intron RNA is produced is not known. This paper exemplifies a putative function of an intron RNA, the alternatively spliced intron 5, exon 6 and intron 6 (i5e6i6) of the U2 small nuclear ribonucleoprotein particle (U2 snRNP) A' specific protein (U2A') pre-mRNA. The U2 snRNP is a central component of the spliceosomes and very abundant in the human nucleus. The U2 snRNA genes are tandemly repeated in the RNU2 locus, which occasionally co-localizes to Cajal bodies in a transcription-dependent process that is not very well understood. We have earlier found that U2A' exon 6, which is skipped in alternative splicing, is highly conserved in its nucleotide sequence. In this paper I have searched for a possible function of the U2A'i5e6i6 RNA.
The U2A'i5e6i6 contains conserved sequence cassettes that are complementary to cassettes of the U2 snRNA. A possible RNA-RNA structure, based on RNA helices that may form by these complementary sequences, is presented. The structure, which is conserved in vertebrates, suggests a role of U2A'i5e6i6 in the 3'end processing of U2 snRNA primary transcript.
I predict a function of the U2A'i5e6i6 RNA in the 3' end processing of the U2 snRNA primary transcripts, a process that most probably occurs during the RNU2 co-localization to Cajal bodies. The production of U2 snRNPs would thus be autoregulated by coupling the splicing efficiency of one of its components (U2A') to the transcription of another (U2 snRNA). Such an autoregulatory function may well be a common feature of introns.
PMCID: PMC4071252
15.  Domains of human U4atac snRNA required for U12-dependent splicing in vivo 
Nucleic Acids Research  2002;30(21):4650-4657.
U4atac snRNA forms a base-paired complex with U6atac snRNA. Both snRNAs are required for the splicing of the minor U12-dependent class of eukaryotic nuclear introns. We have developed a new genetic suppression assay to investigate the in vivo roles of several regions of U4atac snRNA in U12-dependent splicing. We show that both the stem I and stem II regions, which have been proposed to pair with U6atac snRNA, are required for in vivo splicing. Splicing activity also requires U4atac sequences in the 5′ stem–loop element that bind a 15.5 kDa protein that also binds to a similar region of U4 snRNA. In contrast, mutations in the region immediately following the stem I interaction region, as well as a deletion of the distal portion of the 3′ stem–loop element, were active for splicing. Complete deletion of the 3′ stem–loop element abolished in vivo splicing function as did a mutation of the Sm protein binding site. These results show that the in vivo sequence requirements of U4atac snRNA are similar to those described previously for U4 snRNA using in vitro assays and provide experimental support for models of the U4atac/U6atac snRNA interaction.
PMCID: PMC135832  PMID: 12409455
16.  Analysis of canonical and non-canonical splice sites in mammalian genomes 
Nucleic Acids Research  2000;28(21):4364-4375.
A set of 43 337 splice junction pairs was extracted from mammalian GenBank annotated genes. Expressed sequence tag (EST) sequences support 22 489 of them. Of these, 98.71% contain canonical dinucleotides GT and AG for donor and acceptor sites, respectively; 0.56% hold non-canonical GC-AG splice site pairs; and the remaining 0.73% occur in many small groups (with a maximum size of 0.05%). Studying these groups we observe that many of them contain splicing dinucleotides shifted from the annotated splice junction by one position. After close examination of such cases we present a new classification consisting of only eight observed types of splice site pairs (out of 256 a priori possible combinations). EST alignments allow us to verify the exonic part of the splice sites, but many non-canonical cases may be due to intron sequencing errors. This idea is given substantial support when we compare the sequences of human genes having non-canonical splice sites deposited in GenBank by high throughput genome sequencing projects (HTG). A high proportion (156 out of 171) of the human non-canonical and EST-supported splice site sequences had a clear match in the human HTG. They can be classified after corrections as: 79 GC-AG pairs (of which one was an error that corrected to GC-AG), 61 errors that were corrected to GT-AG canonical pairs, six AT-AC pairs (of which two were errors that corrected to AT-AC), one case that was produced from a non-existent intron, seven cases that were found in HTG that were deposited to GenBank and finally only two remaining cases of supported non-canonical splice sites. If we assume that approximately the same situation is true for the whole set of annotated mammalian non-canonical splice sites, then 99.24% of splice site pairs should be GT-AG, 0.69% GC-AG, 0.05% AT-AC and finally only 0.02% could consist of other types of non-canonical splice sites. 
We analyze several characteristics of EST-verified splice sites and build weight matrices for the major groups, which can be incorporated into gene prediction programs. We also present a set of EST-verified canonical splice sites larger by two orders of magnitude than the current one (22 199 entries versus ~600) and finally, a set of 290 EST-supported non-canonical splice sites. Both sets should be significant for future investigations of the splicing mechanism.
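The splice site pair categories and the arithmetic quoted in this abstract can be checked with a small sketch. The classifier below only inspects the first and last two bases of an intron; it is an illustrative simplification rather than the authors' procedure, and the function name and category labels are assumptions made here.

```python
def classify_splice_pair(intron: str) -> str:
    """Label an intron by its donor and acceptor dinucleotides."""
    s = intron.upper().replace("U", "T")
    if len(s) < 4:
        raise ValueError("intron too short to hold donor and acceptor sites")
    pair = f"{s[:2]}-{s[-2:]}"
    if pair == "GT-AG":
        return "canonical GT-AG"
    if pair in ("GC-AG", "AT-AC"):
        return f"non-canonical {pair}"
    return f"other ({pair})"


# 4 bases at each of 2 donor and 2 acceptor positions give the
# 256 a priori possible dinucleotide pairs mentioned in the abstract.
possible_pairs = (4 ** 2) * (4 ** 2)

# Observed shares in the EST-supported set (percent), as quoted.
observed_total = round(98.71 + 0.56 + 0.73, 2)

# Estimated shares after error correction (percent), as quoted.
corrected_total = round(99.24 + 0.69 + 0.05 + 0.02, 2)
```

Both sets of percentages sum to 100, so the quoted categories partition the data in each case.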
PMCID: PMC113136  PMID: 11058137
17.  Genome-Wide Association between Branch Point Properties and Alternative Splicing 
PLoS Computational Biology  2010;6(11):e1001016.
The branch point (BP) is one of the three obligatory signals required for pre-mRNA splicing. In mammals, the degeneracy of the motif combined with the lack of a large set of experimentally verified BPs complicates the task of modeling it in silico, and therefore of predicting the location of natural BPs. Consequently, BPs have been disregarded in a considerable fraction of the genome-wide studies on the regulation of splicing in mammals. We present a new computational approach for mammalian BP prediction. Using sequence conservation and positional bias we obtained a set of motifs with good agreement with U2 snRNA binding stability. Using a Support Vector Machine algorithm, we created a model complemented with polypyrimidine tract features, which considerably improves the prediction accuracy over previously published methods. Applying our algorithm to human introns, we show that BP position is highly dependent on the presence of AG dinucleotides in the 3′ end of introns, with distance to the 3′ splice site and BP strength strongly correlating with alternative splicing. Furthermore, experimental BP mapping for five exons preceded by long AG-dinucleotide exclusion zones revealed that, for a given intron, more than one BP can be chosen throughout the course of splicing. Finally, the comparison between exons of different evolutionary ages and pseudo exons suggests a key role of the BP in the pathway of exon creation in human. Our computational and experimental analyses suggest that BP recognition is more flexible than previously assumed, and it appears highly dependent on the presence of downstream polypyrimidine tracts. The reported association between BP features and the splicing outcome suggests that this, so far disregarded but yet crucial, element buries information that can complement current acceptor site models.
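The AG-dinucleotide exclusion zone mentioned in this abstract can be illustrated with a simplified measurement: the length of the AG-free stretch immediately upstream of the 3′ splice site AG. This sketch is not the Support Vector Machine model described in the study, and the exact definition used here (no minimum distance from the splice site, the function name) is an assumption.

```python
def ag_exclusion_zone(intron: str) -> int:
    """Length of the AG-free stretch just upstream of the 3' splice site AG.

    Simplifying assumption: the zone runs from the nearest upstream AG
    (or the intron start, if there is none) to the terminal AG.
    """
    s = intron.upper().replace("U", "T")
    if not s.endswith("AG"):
        raise ValueError("intron must end with the 3' splice site AG")
    upstream = s[:-2]                # everything before the terminal AG
    last_ag = upstream.rfind("AG")   # start index of nearest upstream AG
    if last_ag == -1:
        return len(upstream)         # no upstream AG at all
    return len(upstream) - (last_ag + 2)


zone_with_upstream_ag = ag_exclusion_zone("GTAAGTCCCTTTTTTCAG")
zone_without_upstream_ag = ag_exclusion_zone("GTCCTTTTCAG")
```

A longer zone leaves more sequence in which a branch point could lie, which is consistent with the abstract's observation that introns with long exclusion zones can use more than one BP.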
Author Summary
From transcription to translation, the events underlying protein production from DNA sequence are paramount to all aspects of cellular function. Pre-mRNAs in eukaryotes undergo several processing steps prior to their export to the cytoplasm. Among these, splicing – the process of intron removal and exon ligation – has been shown to play a central role in the regulation of gene expression. It has been estimated that more than half of the disease-causing mutations in humans do so by interfering with splicing. The difficulty in describing these disease mechanisms often lies in the low accuracy of the methods for prediction of functional splicing signals in the pre-mRNA. This is especially the case of the branch point, mainly due to its high sequence variability. We have developed a methodology for mammalian branch point prediction based on a machine-learning algorithm, which shows improved accuracy over previous published methods. Moreover, using a combination of experimental and bioinformatics approaches, we uncovered important positional properties of the branch point and shed new light on how some of its features may contribute to the final splicing outcome. These findings might prove useful for a better understanding of how splicing-associated mutations can lead to disease.
PMCID: PMC2991248  PMID: 21124863
18.  PRP38 encodes a yeast protein required for pre-mRNA splicing and maintenance of stable U6 small nuclear RNA levels. 
Molecular and Cellular Biology  1992;12(9):3939-3947.
An essential pre-mRNA splicing factor, the product of the PRP38 gene, has been genetically identified in a screen of temperature-sensitive mutants of Saccharomyces cerevisiae. Shifting temperature-sensitive prp38 cultures from 23 to 37 degrees C prevents the first cleavage-ligation event in the excision of introns from mRNA precursors. In vitro splicing inactivation and complementation studies suggest that the PRP38-encoded factor functions, at least in part, after stable splicing complex formation. The PRP38 locus contains a 726-bp open reading frame coding for an acidic 28-kDa polypeptide (PRP38). While PRP38 lacks obvious structural similarity to previously defined splicing factors, heat inactivation of PRP38, PRP19, or any of the known U6 (or U4/U6) small nuclear ribonucleoprotein-associating proteins (i.e., PRP3, PRP4, PRP6, and PRP24) leads to a common, unexpected consequence: intracellular U6 small nuclear RNA (snRNA) levels decrease as splicing activity is lost. Curiously, U4 snRNA, normally extensively base paired with U6 snRNA, persists in the virtual absence of U6 snRNA.
PMCID: PMC360275  PMID: 1508195
19.  The 5'-terminal sequence of U1 RNA complementary to the consensus 5' splice site of hnRNA is single-stranded in intact U1 snRNP particles. 
Nucleic Acids Research  1984;12(10):4111-4126.
The 5'-terminal region of U1 snRNA is highly complementary to the consensus exon-intron regions of hnRNA and it has been suggested that U1 snRNP might play a role in the splicing of the pre-mRNA by intermolecular base-pairing between these regions. Here the secondary structure of the 5' terminus of U1 RNA in the isolated native U1 snRNP particle has been investigated by site-directed enzymatic cleavage of the RNA. Individual oligodeoxynucleotides complementary to various sequences within the first 15 nucleotides of the 5' terminus of U1 RNA have been tested for their ability to form stable DNA·RNA hybrids, with subsequent cleavage of the U1 RNA by RNase H. Our results show unequivocally that the 9 nucleotides at the 5' terminus which are complementary to a consensus 5' splice site are indeed single-stranded in the intact U1 snRNP particle, and are not protected by snRNP proteins. However, they also indicate that the U1 sequence complementary to an intron's consensus 3' end is not readily available for intermolecular base-pairing, either in the intact U1 snRNP particle or in the deproteinized U1 RNA molecule. Therefore our data favour the possibility that U1 snRNP plays a role only in the recognition of a 5' splice site of hnRNA, rather than being involved in the alignment of both ends of an intron for splicing.
PMCID: PMC318820  PMID: 6203096
20.  The Pivotal Roles of TIA Proteins in 5′ Splice-Site Selection of Alu Exons and Across Evolution 
PLoS Genetics  2009;5(11):e1000717.
More than 5% of alternatively spliced internal exons in the human genome are derived from Alu elements in a process termed exonization. Alus are comprised of two homologous arms separated by an internal polypyrimidine tract (PPT). In most exonizations, splice sites are selected from within the same arm. We hypothesized that the internal PPT may prevent selection of a splice site further downstream. Here, we demonstrate that this PPT enhanced the selection of an upstream 5′ splice site (5′ss), even in the presence of a stronger 5′ss downstream. Deletion of this PPT shifted selection to the stronger downstream 5′ss. This enhancing effect depended on the strength of the downstream 5′ss, on the efficiency of base-pairing to U1 snRNA, and on the length of the PPT. This effect of the PPT was mediated by the binding of TIA proteins and was dependent on the distance between the PPT and the upstream 5′ss. A wide-scale evolutionary analysis of introns across 22 eukaryotes revealed an enrichment in PPTs within ∼20 nt downstream of the 5′ss. For most metazoans, the strength of the 5′ss inversely correlated with the presence of a downstream PPT, indicative of the functional role of the PPT. Finally, we found that the proteins that mediate this effect, TIA and U1C, and in particular their functional domains, are highly conserved across evolution. Overall, these findings expand our understanding of the role of TIA1/TIAR proteins in enhancing recognition of exons, in general, and Alu exons, in particular.
Author Summary
Human genes are composed of functional regions, termed exons, separated by non-functional regions, termed introns. Intronic sequences may gradually accumulate mutations and subsequently become recognized by the splicing machinery as exons, a process termed exonization. Alu elements are prone to undergo exonization: more than 5% of alternatively spliced internal exons in the human genome originate from Alu elements. A typical Alu element is ∼300 nucleotides long, consisting of two arms separated by a polypyrimidine tract (PPT). Interestingly, in most cases, exonization occurs almost exclusively within either the right arm or the left, not both. Here we found that the PPT between the two arms serves as a binding site for TIA proteins and prevents the exon selection process from expanding into downstream regions. To obtain a wider overview of TIA function, we performed a cross-evolutionary analysis within 22 eukaryotes of this protein and of U1C, a protein known to interact with it, and found that functional regions of both these proteins were highly conserved. These findings highlight the pivotal role of TIA proteins in 5′ splice-site selection of Alu exons and exon recognition in general.
PMCID: PMC2766253  PMID: 19911040
21.  Trans-splicing to Spliceosomal U2 snRNA Suggests Disruption of Branch Site-U2 Pairing During pre-mRNA Splicing 
Molecular cell  2007;26(6):883-890.
Pairing between U2 snRNA and the branch site of spliceosomal introns is essential for spliceosome assembly and is thought to be required for the first catalytic step of splicing. We have identified an RNA comprising the 5’ end of U2 snRNA and the 3’ exon of the ACT1-CUP1 reporter gene, resulting from a trans-splicing reaction in which a 5’ splice site-like sequence in the universally conserved branch site-binding region of U2 is used in trans as a 5’ splice site for both steps of splicing in vivo. Formation of this product occurs in functional spliceosomes assembled on reporter genes whose 5’ splice sites are predicted to bind poorly at the spliceosome catalytic centre. Multiple spatially disparate splice sites in U2 can be used, calling into question both the fate of its pairing to the branch site and the details of its role in splicing catalysis.
PMCID: PMC1973159  PMID: 17588521
U2 snRNA; branch site; trans-splicing; bulged duplex model; splicing catalysis
22.  Pre-mRNA Splicing Is a Determinant of Nucleosome Organization 
PLoS ONE  2013;8(1):e53506.
Chromatin organization affects alternative splicing and previous studies have shown that exons have increased nucleosome occupancy compared with their flanking introns. To determine whether alternative splicing affects chromatin organization we developed a system in which the alternative splicing pattern switched from inclusion to skipping as a function of time. Changes in nucleosome occupancy were correlated with the change in the splicing pattern. Surprisingly, strengthening of the 5′ splice site or strengthening the base pairing of U1 snRNA with an internal exon abrogated the skipping of the internal exons and also affected chromatin organization. Over-expression of splicing regulatory proteins also affected the splicing pattern and changed nucleosome occupancy. A specific splicing inhibitor was used to show that splicing impacts nucleosome organization endogenously. The effect of splicing on the chromatin required a functional U1 snRNA base pairing with the 5′ splice site, but U1 pairing was not essential for U1 snRNA enhancement of transcription. Overall, these results suggest that splicing can affect chromatin organization.
PMCID: PMC3542351  PMID: 23326444
23.  Mutational analysis of Saccharomyces cerevisiae U4 small nuclear RNA identifies functionally important domains. 
Molecular and Cellular Biology  1995;15(3):1274-1285.
U4 small nuclear RNA (snRNA) is essential for pre-mRNA splicing, although its role is not yet clear. On the basis of a model structure (C. Guthrie and B. Patterson, Annu. Rev. Genet. 22:387-419, 1988), the molecule can be thought of as having six domains: stem II, 5' stem-loop, stem I, central region, 3' stem-loop, and 3'-terminal region. We have carried out extensive mutagenesis of the yeast U4 snRNA gene (SNR14) and have obtained information on the effect of mutations at 105 of its 160 nucleotides. Fifteen critical residues in the U4 snRNA have been identified in four domains: stem II, the 5' stem-loop, stem I, and the 3'-terminal region. These domains have been shown previously to be insensitive to oligonucleotide-directed RNase H cleavage (Y. Xu, S. Petersen-Bjørn, and J. D. Friesen, Mol. Cell. Biol. 10:1217-1225, 1990), suggesting that they are involved in intra- or intermolecular interactions. Stem II, a region that base pairs with U6 snRNA, is the most sensitive to mutation of all U4 snRNA domains. In contrast, stem I is surprisingly insensitive to mutational change, which brings into question its role in base pairing with U6 snRNA. All mutations in the putative Sm site of U4 snRNA yield a lethal or conditional-lethal phenotype, indicating that this region is important functionally. Only two nucleotides in the 5' stem-loop are sensitive to mutation; most of this domain can tolerate point mutations or small deletions. The 3' stem-loop, while essential, is very tolerant of change. A large portion of the central domain can be removed or expanded with only minor effects on phenotype, suggesting that it has little function of its own. Analysis of conditional mutations in stem II and stem I indicates that although these single-base changes do not have a dramatic effect on U4 snRNA stability, they are defective in RNA splicing in vivo and in vitro, as well as in spliceosome assembly. These results are discussed in the context of current knowledge of the interactions involving U4 snRNA.
PMCID: PMC230350  PMID: 7862121
24.  The spliceosomal snRNAs of Caenorhabditis elegans. 
Nucleic Acids Research  1990;18(9):2633-2642.
Nematodes are the only group of organisms in which both cis- and trans-splicing of nuclear mRNAs are known to occur. Most Caenorhabditis elegans introns are exceptionally short, often only 50 bases long. The consensus donor and acceptor splice site sequences found in other animals are used for both cis- and trans-splicing. In order to identify the machinery required for these splicing events, we have characterized the C. elegans snRNAs. They are similar in sequence and structure to those characterized in other organisms, and several sequence variations discovered in the nematode snRNAs provide support for previously proposed structure models. The C. elegans snRNAs are encoded by gene families. We report here the sequences of many of these genes. We find a highly conserved sequence, the proximal sequence element (PSE), about 65 bp upstream of all 21 snRNA genes thus far sequenced, including the SL RNA genes, which specify the snRNAs that provide the 5' exons in trans-splicing. The sequence of the C. elegans PSE is distinct from PSE's from other organisms.
PMCID: PMC330746  PMID: 2339054
25.  Multiple functional domains of human U2 small nuclear RNA: strengthening conserved stem I can block splicing. 
Molecular and Cellular Biology  1992;12(12):5464-5473.
We showed previously that a branch site mutation in simian virus 40 early pre-mRNA that prevented small t antigen mRNA splicing could be efficiently suppressed by a compensatory mutation in a coexpressed U2 small nuclear (sn) RNA gene. We have now generated second-site mutations in this suppressor gene to investigate regions of U2 RNA required for function. A number of mutations in a putative stem at the 5' end of the molecule inhibited splicing, indicating that bases in this region are important for activity. However, several lines of evidence suggested that formation of the entire stem is not essential for splicing. Indeed, mutations that strengthen the stem actually inhibited splicing, and evidence that this prevents a required base-pairing interaction with U6 snRNA is presented. These results suggest that the relative stabilities of competing intra- and intermolecular base-pairing interactions play an important role in the splicing reaction. Mutations in a conserved single-stranded region immediately 3' to the branch site recognition sequence all inhibited splicing, indicating that this region is required for U2 function, although its exact role remains unknown. Finally, two mutations in the loop of stem IV at the 3' end of the molecule, which destroy the binding site of U2 sn ribonucleoprotein B", prevented small t splicing; this finding contrasts with previous studies which utilized different assay systems. Analysis of the accumulation and subcellular localization of all of the mutant RNAs showed that they were similar to those of the parental suppressor U2 RNA, indicating that the effects observed indeed reflect defects in splicing.
PMCID: PMC360484  PMID: 1448079
