Related Articles

1.  A phylogenetic study of Drosophila splicing assembly chaperone RNP-4F associated U4-/U6-snRNA secondary structure 
The rnp-4f gene in Drosophila melanogaster encodes nuclear protein RNP-4F. This encoded protein is represented by homologs in other eukaryotic species, where it has been shown to function as an intron splicing assembly factor. Here, RNP-4F is believed to initially bind to a recognition sequence on U6-snRNA, serving as a chaperone to facilitate its association with U4-snRNA by intermolecular hydrogen bonding. RNA conformations are a key factor in spliceosome function, so that elucidation of changing secondary structures for interacting snRNAs is a subject of considerable interest and importance. Among the five snRNAs which participate in removal of spliceosomal introns, there is a growing consensus that U6-snRNA is the most structurally dynamic and may constitute the catalytic core. Previous studies by others have generated potential secondary structures for free U4- and U6-snRNAs, including the Y-shaped U4-/U6-snRNA model. These models were based on study of RNAs from relatively few species, and the popular Y-shaped model remains to be systematically re-examined with reference to the many new sequences generated by recent genomic sequencing projects. We have utilized a comparative phylogenetic approach on 60 diverse eukaryotic species, which resulted in a revised and improved U4-/U6-snRNA secondary structure. This general model is supported by observation of abundant compensatory base mutations in every stem, and incorporates more of the nucleotides into base-paired associations than in previous models, thus being more energetically stable. We have extensively sampled the eukaryotic phylogenetic tree to its deepest roots, but did not find genes potentially encoding either U4- or U6-snRNA in the Giardia and Trichomonas databases. Our results support the hypothesis that nuclear introns in these most deeply rooted eukaryotes may represent evolutionary intermediates, sharing characteristics of both group II and spliceosomal introns. 
An unexpected result of this study was the discovery of a potential competitive binding site for Drosophila splicing assembly factor RNP-4F in a 5’-UTR regulatory region within its own pre-mRNA, which may play a role in negative feedback control.
PMCID: PMC4237228  PMID: 25419488
RNP-4F; snRNA Secondary Structure; U4-/U6-snRNA Phylogeny; Spliceosome Evolution
2.  Alternative Splicing of RNA Triplets Is Often Regulated and Accelerates Proteome Evolution 
PLoS Biology  2012;10(1):e1001229.
Inclusion or exclusion of single codons at the splice acceptor site of mammalian genes is regulated in a tissue-specific manner, is strongly conserved, and is associated with local accelerated protein evolution.
Thousands of human genes contain introns ending in NAGNAG (N any nucleotide), where both NAGs can function as 3′ splice sites, yielding isoforms that differ by inclusion/exclusion of three bases. However, few models exist for how such splicing might be regulated, and some studies have concluded that NAGNAG splicing is purely stochastic and nonfunctional. Here, we used deep RNA-Seq data from 16 human and eight mouse tissues to analyze the regulation and evolution of NAGNAG splicing. Using both biological and technical replicates to estimate false discovery rates, we estimate that at least 25% of alternatively spliced NAGNAGs undergo tissue-specific regulation in mammals, and alternative splicing of strongly tissue-specific NAGNAGs was 10 times as likely to be conserved between species as was splicing of non-tissue-specific events, implying selective maintenance. Preferential use of the distal NAG was associated with distinct sequence features, including a more distal location of the branch point and presence of a pyrimidine immediately before the first NAG, and alteration of these features in a splicing reporter shifted splicing away from the distal site. Strikingly, alignments of orthologous exons revealed a ∼15-fold increase in the frequency of three base pair gaps at 3′ splice sites relative to nearby exon positions in both mammals and in Drosophila. Alternative splicing of NAGNAGs in human was associated with dramatically increased frequency of exon length changes at orthologous exon boundaries in rodents, and a model involving point mutations that create, destroy, or alter NAGNAGs can explain both the increased frequency and biased codon composition of gained/lost sequence observed at the beginnings of exons. 
This study shows that NAGNAG alternative splicing generates widespread differences between the proteomes of mammalian tissues, and suggests that the evolutionary trajectories of mammalian proteins are strongly biased by the locations and phases of the introns that interrupt coding sequences.
Author Summary
In order to translate a gene into protein, all of the non-coding regions (introns) need to be removed from the transcript and the coding regions (exons) stitched back together to make an mRNA. Most human genes are alternatively spliced, allowing the selection of different combinations of exons to produce multiple distinct mRNAs and proteins. Many types of alternative splicing are known to play crucial roles in biological processes including cell fate determination, tumor metabolism, and apoptosis. In this study, we investigated a form of alternative splicing in which competing adjacent 3′ splice sites (or splice acceptor sites) generate mRNAs differing by just an RNA triplet, the size of a single codon. This mode of alternative splicing, known as NAGNAG splicing, affects thousands of human genes and has been known for a decade, but its potential regulation, physiological importance, and conservation across species have been disputed. Using high-throughput sequencing of cDNA (“RNA-Seq”) from human and mouse tissues, we found that single-codon splicing often shows strong tissue specificity. Regulated NAGNAG alternative splice sites are selectively conserved between human and mouse genes, suggesting that they are important for organismal fitness. We identified features of the competing splice sites that influence NAGNAG splicing, and validated their effects in cultured cells. Furthermore, we found that this mode of splicing is associated with accelerated and highly biased protein evolution at exon boundaries. Taken together, our analyses demonstrate that the inclusion or exclusion of RNA triplets at exon boundaries can be effectively regulated by the splicing machinery, and highlight an unexpected connection between RNA processing and protein evolution.
PMCID: PMC3250501  PMID: 22235189
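The NAGNAG mechanism described in entry 2 (two adjacent NAG acceptors yielding isoforms that differ by three bases) can be illustrated with a minimal sketch. The function name `nagnag_isoforms`, the input layout (intron and downstream exon as separate DNA strings), and the example sequences are assumptions made for illustration only; this is not the authors' RNA-Seq pipeline.

```python
import re

# N-A-G-N-A-G, where N is any nucleotide (DNA alphabet assumed).
NAGNAG = re.compile(r"[ACGT]AG[ACGT]AG")


def nagnag_isoforms(intron: str, exon: str):
    """Return the two alternative exon starts for an intron ending in NAGNAG.

    Hypothetical helper: `intron` is the intron sequence (5'->3') and `exon`
    is the downstream exon as annotated after the second (distal) NAG.
    Returns None when the last six intron bases are not a NAGNAG motif.
    """
    intron, exon = intron.upper(), exon.upper()
    tail = intron[-6:]
    if not NAGNAG.fullmatch(tail):
        return None
    return {
        "motif": tail,
        # Splicing at the first (proximal) NAG leaves the second NAG in the mRNA.
        "proximal": tail[3:] + exon,
        # Splicing at the second (distal) NAG removes all six bases.
        "distal": exon,
    }


if __name__ == "__main__":
    result = nagnag_isoforms("GTAAGTTTTCTTTCAGCAG", "GTTGCA")
    print(result)
```

The two returned sequences always differ in length by exactly three bases, which is why this class of event adds or removes a single codon without shifting the reading frame.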
3.  Evolutionarily divergent spliceosomal snRNAs and a conserved non-coding RNA processing motif in Giardia lamblia 
Nucleic Acids Research  2012;40(21):10995-11008.
Non-coding RNAs (ncRNAs) have diverse essential biological functions in all organisms, and in eukaryotes, two such classes of ncRNAs are the small nucleolar (sno) and small nuclear (sn) RNAs. In this study, we have identified and characterized a collection of sno and snRNAs in Giardia lamblia, by exploiting our discovery of a conserved 12 nt RNA processing sequence motif found in the 3′ end regions of a large number of G. lamblia ncRNA genes. RNA end mapping and other experiments indicate the motif serves to mediate ncRNA 3′ end formation from mono- and di-cistronic RNA precursor transcripts. Remarkably, we find the motif is also utilized in the processing pathway of all four previously identified trans-spliced G. lamblia introns, revealing a common RNA processing pathway for ncRNAs and trans-spliced introns in this organism. Motif sequence conservation then allowed for the bioinformatic and experimental identification of additional G. lamblia ncRNAs, including new U1 and U6 spliceosomal snRNA candidates. The U6 snRNA candidate was then used as a tool to identify novel U2 and U4 snRNAs, based on predicted phylogenetically conserved snRNA–snRNA base-pairing interactions, from a set of previously identified G. lamblia ncRNAs without assigned function. The Giardia snRNAs retain the core features of spliceosomal snRNAs but are sufficiently evolutionarily divergent to explain the difficulties in their identification. Most intriguingly, all of these snRNAs show structural features diagnostic of U2-dependent/major and U12-dependent/minor spliceosomal snRNAs.
PMCID: PMC3510501  PMID: 23019220
4.  The U1, U2 and U5 snRNAs crosslink to the 5′ exon during yeast pre-mRNA splicing 
Nucleic Acids Research  2007;36(3):814-825.
Activation of pre-messenger RNA (pre-mRNA) splicing requires 5′ splice site recognition by U1 small nuclear RNA (snRNA), which is replaced by U5 and U6 snRNA. Here we use crosslinking to investigate snRNA interactions with the 5′ exon adjacent to the 5′ splice site, prior to the first step of splicing. U1 snRNA was found to interact with four different 5′ exon positions using one specific sequence adjacent to U1 snRNA helix 1. This novel interaction of U1 we propose occurs before U1-5′ splice site base pairing. In contrast, U5 snRNA interactions with the 5′ exon of the pre-mRNA progressively shift towards the 5′ end of U5 loop 1 as the crosslinking group is placed further from the 5′ splice site, with only interactions closest to the 5′ splice site persisting to the 5′ exon intermediate and the second step of splicing. A novel yeast U2 snRNA interaction with the 5′ exon was also identified, which is ATP dependent and requires U2-branchpoint interaction. This study provides insight into the nature and timing of snRNA interactions required for 5′ splice site recognition prior to the first step of pre-mRNA splicing.
PMCID: PMC2241886  PMID: 18084028
5.  Competing Upstream 5′ Splice Sites Enhance the Rate of Proximal Splicing 
Molecular and Cellular Biology  2010;30(8):1878-1886.
Alternative 5′ splice site selection is one of the major pathways resulting in mRNA diversification. Regulation of this type of alternative splicing depends on the presence of regulatory elements that activate or repress the use of competing splice sites, usually leading to the preferential use of the proximal splice site. However, the mechanisms involved in proximal splice site selection and the thermodynamic advantage realized by proximal splice sites are not well understood. Here, we have carried out a systematic analysis of alternative 5′ splice site usage using in vitro splicing assays. We show that observed rates of splicing correlate well with their U1 snRNA base pairing potential. Weak U1 snRNA interactions with the 5′ splice site were significantly rescued by the proximity of the downstream exon, demonstrating that the intron definition mode of splice site recognition is highly efficient. In the context of competing splice sites, the proximity to the downstream 3′ splice site was more influential in dictating splice site selection than the actual 5′ splice site/U1 snRNA base pairing potential. Surprisingly, the kinetic analysis also demonstrated that an upstream competing 5′ splice site enhances the rate of proximal splicing. These results reveal the discovery of a new splicing regulatory element, an upstream 5′ splice site functioning as a splicing enhancer.
PMCID: PMC2849477  PMID: 20123971
6.  A novel approach to describe a U1 snRNA binding site 
Nucleic Acids Research  2003;31(23):6963-6975.
RNA duplex formation between U1 snRNA and a splice donor (SD) site can protect pre-mRNA from degradation prior to splicing and initiates formation of the spliceosome. This process was monitored, using sub-genomic HIV-1 expression vectors, by expression analysis of the glycoprotein env, whose formation critically depends on functional SD4. We systematically derived a hydrogen bond model for the complementarity between the free 5′ end of U1 snRNA and 5′ splice sites and numerous mutations following transient transfection of HeLa-T4+ cells with 5′ splice site mutated vectors. The resulting model takes into account number, interdependence and neighborhood relationships of predicted hydrogen bond formation in a region spanning the three most 3′ base pairs of the exon (–3 to –1) and the eight most 5′ base pairs of the intron (+1 to +8). The model is represented by an algorithm classifying U1 snRNA binding sites which can or cannot functionally substitute SD4 with respect to Rev-mediated env expression. In a data set of 5′ splice site mutations of the human ATM gene we found a significant correlation between the algorithmic classification and exon skipping (P = 0.018, χ2-test), showing that the applicability of the proposed model reaches far beyond HIV-1 splicing. However, the algorithmic classification must not be taken as an absolute measure of SD usage as it may be modified by upstream sequence elements. Upstream to SD4 we identified a fragment supporting ASF/SF2 binding. Mutating GAR nucleotide repeats within this site decreased the SD4-dependent Rev-mediated env expression, which could be balanced simply by artificially increasing the complementarity of SD4.
PMCID: PMC290269  PMID: 14627829
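The idea in entry 6 of scoring complementarity between the free 5′ end of U1 snRNA and an 11-nucleotide 5′ splice site (exon positions –3 to –1, intron positions +1 to +8) can be sketched as a plain hydrogen-bond count. This is a deliberately simplified illustration, not the published algorithm, which also weighs interdependence and neighborhood relationships between bonds. The per-pair bond counts (3 for G–C, 2 for A–U, 2 for the G·U wobble, 0 otherwise), the function name `hbond_count`, and the treatment of the U1 pseudouridines as plain U are assumptions.

```python
# Human U1 snRNA 5' end, written 5'->3' (pseudouridines shown as U).
U1_5P_END = "AUACUUACCUG"

# Simplified hydrogen-bond counts per base pair (an assumption for this sketch).
H_BONDS = {
    ("G", "C"): 3, ("C", "G"): 3,
    ("A", "U"): 2, ("U", "A"): 2,
    ("G", "U"): 2, ("U", "G"): 2,  # wobble pair
}


def hbond_count(splice_site: str) -> int:
    """Count hydrogen bonds between an 11-nt 5' splice site and the U1 5' end.

    `splice_site` covers positions -3..+8 (5'->3', DNA or RNA letters).
    The duplex is antiparallel, so position -3 pairs with the last U1 base.
    """
    ss = splice_site.upper().replace("T", "U")
    if len(ss) != len(U1_5P_END):
        raise ValueError("expected an 11-nucleotide splice site (-3 to +8)")
    u1_antiparallel = U1_5P_END[::-1]
    return sum(H_BONDS.get(pair, 0) for pair in zip(ss, u1_antiparallel))


if __name__ == "__main__":
    for site in ("CAGGTAAGTAT", "CAGGCAAGTAT", "AAGGTGAGCCT"):
        print(site, hbond_count(site))
```

A raw count like this treats every position independently; the abstract's point is precisely that a usable classifier needs more than that, and that upstream elements such as the ASF/SF2 binding fragment can modify splice donor usage further.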
7.  Activity of chimeric RNAs of U6 snRNA and (-)sTRSV in the cleavage of a substrate RNA. 
Nucleic Acids Research  1992;20(12):2991-2996.
U6 small nuclear RNA is one of the spliceosomal RNAs essential for pre-mRNA splicing. Discovery of mRNA-type introns in the highly conserved region of the U6 snRNA genes led to the hypothesis that U6 snRNA functions as a catalytic element during pre-mRNA splicing. The highly conserved region of U6 snRNA has a structural similarity with the catalytic domain of the negative strand of the satellite RNA of tobacco ring spot virus [(-)sTRSV], suggesting that the highly conserved region of U6 snRNA forms the catalytic center. We examined whether synthetic RNAs consisting of the sequence of the highly conserved region of U6 snRNA or various chimeric RNAs between the U6 region and the catalytic RNA of (-)sTRSV could cleave a substrate RNA that can partially base-pair with them and have a GU sequence. Chimeric RNAs with 70 to 83% sequence identity with the conserved region of S. pombe U6 snRNA cleaved the substrate RNA at the 5' side of the GU sequence, which is shared by the 5' end of an intron in a pre-mRNA. We found that the highly conserved region of U6 snRNA and the catalytic domain of (-)sTRSV are strikingly similar in structure to the catalytic core region of the group I self-splicing intron in cyanobacteria. These results suggest that U6 snRNA, (-)sTRSV and the group I self-splicing intron originated from a common ancestral RNA, and support the hypothesis that U6 snRNA catalyzes pre-mRNA splicing reaction.
PMCID: PMC312428  PMID: 1620594
8.  Human GC-AG alternative intron isoforms with weak donor sites show enhanced consensus at acceptor exon positions 
Nucleic Acids Research  2001;29(12):2581-2593.
It has been previously observed that the intrinsically weak variant GC donor sites, in order to be recognized by the U2-type spliceosome, possess strong consensus sequences maximized for base pair formation with U1 and U5/U6 snRNAs. However, variability in signal strength is a fundamental mechanism for splice site selection in alternative splicing. Here we report human alternative GC-AG introns (for the first time from any species), and show that while constitutive GC-AG introns do possess strong signals at their donor sites, a large subset of alternative GC-AG introns possess weak consensus sequences at their donor sites. Surprisingly, this subset of alternative isoforms shows strong consensus at acceptor exon positions 1 and 2. The improved consensus at the acceptor exon can facilitate a strong interaction with U5 snRNA, which tethers the two exons for ligation during the second step of splicing. Further, these isoforms nearly always possess alternative acceptor sites and exhibit particularly weak polypyrimidine tracts characteristic of AG-dependent introns. The acceptor exon nucleotides are part of the consensus required for the U2AF35-mediated recognition of AG in such introns. Such improved consensus at acceptor exons is not found in either normal or alternative GT-AG introns having weak donor sites or weak polypyrimidine tracts. The changes probably reflect mechanisms that allow GC-AG alternative intron isoforms to cope with two conflicting requirements, namely an apparent need for differential splice strength to direct the choice of alternative sites and a need for improved donor signals to compensate for the central mismatch base pair (C-A) in the RNA duplex of U1 snRNA and the pre-mRNA. The other important findings include (i) one in every twenty alternative introns is a GC-AG intron, and (ii) three of every five observed GC-AG introns are alternative isoforms.
PMCID: PMC55748  PMID: 11410667
9.  An mRNA-type intron is present in the Rhodotorula hasegawae U2 small nuclear RNA gene. 
Molecular and Cellular Biology  1993;13(9):5613-5619.
Splicing an mRNA precursor requires multiple factors involving five small nuclear RNA (snRNA) species called U1, U2, U4, U5, and U6. The presence of mRNA-type introns in the U6 snRNA genes of some yeasts led to the hypothesis that U6 snRNA may play a catalytic role in pre-mRNA splicing and that the U6 introns occurred through reverse splicing of an intron from an mRNA precursor into a catalytic site of U6 snRNA. We characterized the U2 snRNA gene of the yeast Rhodotorula hasegawae, which has four mRNA-type introns in the U6 snRNA gene, and found an mRNA-type intron of 60 bp. The intron of the U2 snRNA gene is present in the highly conserved region immediately downstream of the branch site recognition domain. Interestingly, we found that this region can form a novel base pairing with U6 snRNA. We discuss the possible implications of these findings for the mechanisms of intron acquisition and for the role of U2 snRNA in pre-mRNA splicing.
PMCID: PMC360287  PMID: 8355704
10.  Plant intron sequences: evidence for distinct groups of introns. 
Nucleic Acids Research  1988;16(14B):7159-7176.
In vivo and in vitro RNA splicing experiments have demonstrated that the intron splicing machineries are not interchangeable in all organisms. These differences have prevented the efficient in vivo expression of monocot genes containing introns in dicot plants and the in vitro excision of some plant introns in HeLa cell in vitro splicing extracts. We have analyzed plant introns for sequence differences which potentially account for the functional splicing differences. Three classes of plant introns can be differentiated by the purine or pyrimidine-richness of sequences upstream from the 3' splice site. The frequency of these three types of introns in monocots and dicots varies significantly. The degree of variability in the 5' and 3' intron boundaries is evaluated for each of these classes in monocots and dicots. The 5' splice site consensus sequences developed for the monocot and dicot introns differ in their ability to base pair with conserved nucleotides present at the 5' end of many U1 snRNAs.
PMCID: PMC338358  PMID: 3405760
11.  Crystal structure of human U1 snRNP, a small nuclear ribonucleoprotein particle, reveals the mechanism of 5′ splice site recognition 
eLife  4:e04986.
U1 snRNP binds to the 5′ exon-intron junction of pre-mRNA and thus plays a crucial role at an early stage of pre-mRNA splicing. We present two crystal structures of engineered U1 sub-structures, which together reveal at atomic resolution an almost complete network of protein–protein and RNA-protein interactions within U1 snRNP, and show how the 5′ splice site of pre-mRNA is recognised by U1 snRNP. The zinc-finger of U1-C interacts with the duplex between pre-mRNA and the 5′-end of U1 snRNA. The binding of the RNA duplex is stabilized by hydrogen bonds and electrostatic interactions between U1-C and the RNA backbone around the splice junction but U1-C makes no base-specific contacts with pre-mRNA. The structure, together with RNA binding assays, shows that the selection of 5′-splice site nucleotides by U1 snRNP is achieved predominantly through basepairing with U1 snRNA whilst U1-C fine-tunes relative affinities of mismatched 5′-splice sites.
eLife digest
Genes are made up of long stretches of DNA. The regions of a gene that code for proteins (known as exons) are interrupted by stretches of non-coding DNA called introns. To produce proteins from a gene, the DNA is ‘transcribed’ to form pre-mRNA molecules, from which the introns must be removed in a process called splicing. The remaining exons are then joined together to form a mature mRNA molecule that contains the instructions to build a protein. Errors in the splicing process can lead to numerous diseases, such as cancer.
A molecular machine known as a spliceosome is responsible for splicing the pre-mRNA molecules. This consists of five different complexes called small nuclear ribonucleoprotein particles (snRNPs), which are in turn made up from numerous proteins and RNA molecules. The spliceosome assembles anew every time it splices, and an early step in this assembly process involves the interaction of an snRNP called U1 with the start of an intron in the pre-mRNA. This interaction then stimulates the assembly of the rest of the spliceosome. In 2009, researchers reported the structure of the U1 snRNP, but the structure did not contain enough detail to reveal how the snRNP recognizes the start of an intron.
Kondo, Oubridge et al., including some of the researchers involved in the 2009 work, now present the crystal structure of the human version of the U1 snRNP in more detail. High-quality crystal structures of the complete U1 snRNP molecule could not be obtained because the arrangement of the RNA molecules in the snRNP prevented a regular crystal from forming. Kondo, Oubridge et al. instead engineered two subcomponents of U1 snRNP that each crystallized well, and determined their structures. This revealed that the interactions between the various parts of the U1 snRNP form a complex network.
A protein present in the U1 snRNP, known as U1-C, had previously been reported to be able to recognize introns on its own—without requiring the complete U1 snRNP. Kondo, Oubridge et al. reveal that this is not the case and that U1-C does not read the intron RNA sequence directly. Instead, U1 snRNP is able to find the start of the intron because the U1 RNA can stably bind to this site. The U1-C protein can however adjust the strength of this binding to ensure that the spliceosome can operate with a variety of intron start sequences (or signals).
PMCID: PMC4383343  PMID: 25555158
pre-mRNA splicing; crystallography; spliceosome; U1 snRNP; 5′ splice site; human
12.  The 5' end domain of U2 snRNA is required to establish the interaction of U2 snRNP with U2 auxiliary factor(s) during mammalian spliceosome assembly. 
Nucleic Acids Research  1991;19(4):877-884.
Stable association of U2 snRNP with the branchpoint sequence of mammalian pre-mRNAs requires binding of a non-snRNP protein to the polypyrimidine tract. In order to determine how U2 snRNP contacts this protein, we have used an RNA containing the consensus 5' and the (Py)n-AG 3' splice sites but lacking the branchpoint sequence so as to prevent direct U2 snRNA base pairing to the branchpoint. Different approaches including electrophoretic separation of RNP complexes formed in nuclear extracts, RNase T1 protection immunoprecipitation assays with antibodies against snRNPs and UV cross-linking experiments coupled to immunoprecipitations allowed us to demonstrate that at least three splicing factors contact this RNA at 0 degree C without ATP. As expected, U1 snRNP interacts with the region comprising the 5' splice site. A protein of approximately 65,000 molecular weight recognizes the RNA specifically at the 5' boundary of the polypyrimidine tract. It could be either the U2 auxiliary factor (U2AF) (Zamore and Green (1989) PNAS 86, 9243-9247), the polypyrimidine tract binding protein (pPTB) (Garcia-Blanco et al. (1989) Genes and Dev. 3, 1874-1886) or a mixture of both. U2 snRNP also contacts the RNA in a way depending on p65 binding, thereby further arguing that the latter may correspond to the previously characterized U2AF and pPTB. Cleavage of U2 snRNA sequence by a complementary oligonucleotide and RNase H led us to conclude that the 5' terminus of U2 snRNA is required to ensure the contact between U2 snRNP and p65 bound to the RNA. More importantly, this conclusion can be extended to authentic pre-mRNAs. When we have used a human beta-globin pre-mRNA instead of the above artificial substrate, RNA bound p65 became precipitable by anti-(U2) RNP and anti-Sm antibodies except when the 5' end of U2 snRNA was selectively cleaved.
PMCID: PMC333725  PMID: 1850127
13.  Uncoupling two functions of the U1 small nuclear ribonucleoprotein particle during in vitro splicing. 
Molecular and Cellular Biology  1993;13(6):3135-3145.
To probe functions of the U1 small nuclear ribonucleoprotein particle (snRNP) during in vitro splicing, we have used unusual splicing substrates which replace the 5' splice site region of an adenovirus substrate with spliced leader (SL) RNA sequences from Leptomonas collosoma or Caenorhabditis elegans. In agreement with previous results (J.P. Bruzik and J.A. Steitz, Cell 62:889-899, 1990), we find that oligonucleotide-targeted RNase H destruction of the 5' end of U1 snRNA inhibits the splicing of a standard adenovirus splicing substrate but not of the SL RNA-containing substrates. However, use of an antisense 2'-O-methyl oligoribonucleotide that disrupts the first stem of U1 snRNA as well as stably sequestering positions of U1 snRNA involved in 5' and 3' splice site recognition inhibits the splicing of both the SL constructs and the standard adenovirus substrate. The 2'-O-methyl oligoribonucleotide is no more effective than RNase H pretreatment in preventing pairing of U1 with the 5' splice site, as assessed by inhibition of psoralen cross-link formation between the SL RNA-containing substrate and U1. The 2'-O-methyl oligoribonucleotide does not alter the protein composition of the U1 monoparticle or deplete the system of essential splicing factors. Native gel analysis indicates that the 2'-O-methyl oligoribonucleotide inhibits splicing by diminishing the formation of splicing complexes. One interpretation of these results is that removal of the 5' end of U1 inhibits base pairing in a different way than sequestering the same sequence with a complementary oligoribonucleotide. Alternatively, our data may indicate that two elements near the 5' end of U1 RNA normally act during spliceosome assembly; the extreme 5' end base pairs with the 5' splice site, while the sequence or structural integrity of stem I is essential for some additional function. It follows that different introns may differ in their use of the repertoire of U1 snRNP functions.
PMCID: PMC359749  PMID: 7684489
14.  Sequence complementarity of U2 snRNA and U2A' intron predicts intron function 
Genome Biology  2005;6(4):P6.
This paper exemplifies a putative function of an intron RNA (i5e6i6) of the U2 small nuclear ribonucleoprotein particle (U2 snRNP) A' specific protein (U2A') pre-mRNA. A possible RNA-RNA structure formed by complementary sequences in U2A'i5e6i6 and U2 snRNA is conserved in vertebrates, suggesting a role of U2A'i5e6i6 in the 3' end processing of the U2 snRNA primary transcript.
The human genome contains about 24% introns and only 1-2% exons. Why such a large amount of intron RNA is produced is not known. This paper exemplifies a putative function of an intron RNA, the alternatively spliced intron 5, exon 6 and intron 6 (i5e6i6) of the U2 small nuclear ribonucleoprotein particle (U2 snRNP) A' specific protein (U2A') pre-mRNA. The U2 snRNP is a central component of the spliceosome and is very abundant in the human nucleus. The U2 snRNA genes are tandemly repeated in the RNU2 locus, which occasionally co-localizes to Cajal bodies in a transcription-dependent process that is not well understood. We have earlier found that U2A' exon 6, which is skipped in alternative splicing, is highly conserved in its nucleotide sequence. In this paper I have searched for a possible function of the U2A'i5e6i6 RNA.
The U2A'i5e6i6 contains conserved sequence cassettes that are complementary to cassettes of the U2 snRNA. A possible RNA-RNA structure, based on RNA helices that may form by these complementary sequences, is presented. The structure, which is conserved in vertebrates, suggests a role of U2A'i5e6i6 in the 3' end processing of the U2 snRNA primary transcript.
I predict a function of the U2A'i5e6i6 RNA in the 3' end processing of the U2 snRNA primary transcripts, a process that most probably occurs during the RNU colocalization to Cajal bodies. The production of U2 snRNPs would thus be autoregulated by coupling the splicing efficiency of one of its components (U2A') to transcription of another (U2 snRNA). Such an autoregulatory function may well be a common feature of introns.
PMCID: PMC4071252
15.  Analysis of canonical and non-canonical splice sites in mammalian genomes 
Nucleic Acids Research  2000;28(21):4364-4375.
A set of 43 337 splice junction pairs was extracted from mammalian GenBank annotated genes. Expressed sequence tag (EST) sequences support 22 489 of them. Of these, 98.71% contain canonical dinucleotides GT and AG for donor and acceptor sites, respectively; 0.56% hold non-canonical GC-AG splice site pairs; and the remaining 0.73% occur in many small groups (with a maximum size of 0.05%). Studying these groups we observe that many of them contain splicing dinucleotides shifted from the annotated splice junction by one position. After close examination of such cases we present a new classification consisting of only eight observed types of splice site pairs (out of 256 a priori possible combinations). EST alignments allow us to verify the exonic part of the splice sites, but many non-canonical cases may be due to intron sequencing errors. This idea is given substantial support when we compare the sequences of human genes having non-canonical splice sites deposited in GenBank by high throughput genome sequencing projects (HTG). A high proportion (156 out of 171) of the human non-canonical and EST-supported splice site sequences had a clear match in the human HTG. They can be classified after corrections as: 79 GC-AG pairs (of which one was an error that corrected to GC-AG), 61 errors that were corrected to GT-AG canonical pairs, six AT-AC pairs (of which two were errors that corrected to AT-AC), one case that was produced from a non-existent intron, seven cases that were found in HTG that were deposited to GenBank and finally only two remaining cases of supported non-canonical splice sites. If we assume that approximately the same situation is true for the whole set of annotated mammalian non-canonical splice sites, then 99.24% of splice site pairs should be GT-AG, 0.69% GC-AG, 0.05% AT-AC and finally only 0.02% could consist of other types of non-canonical splice sites. 
We analyze several characteristics of EST-verified splice sites and build weight matrices for the major groups, which can be incorporated into gene prediction programs. We also present a set of EST-verified canonical splice sites larger by two orders of magnitude than the current one (22 199 entries versus ~600) and finally, a set of 290 EST-supported non-canonical splice sites. Both sets should be significant for future investigations of the splicing mechanism.
PMCID: PMC113136  PMID: 11058137
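The dinucleotide classification in entry 15 can be illustrated with a short sketch that labels an intron by its first and last two bases and enumerates the 256 a priori possible donor/acceptor combinations (4 × 4 donor dinucleotides times 4 × 4 acceptor dinucleotides). The function name `classify_intron`, the label strings, and the toy sequences are hypothetical; the sketch covers only the three major classes named in the abstract and lumps everything else together, so it does not reproduce the paper's eight-type classification.

```python
from collections import Counter
from itertools import product

# Every a priori possible donor-acceptor dinucleotide pair: 16 * 16 = 256.
ALL_PAIRS = [
    f"{''.join(donor)}-{''.join(acceptor)}"
    for donor in product("ACGT", repeat=2)
    for acceptor in product("ACGT", repeat=2)
]


def classify_intron(intron: str) -> str:
    """Label an intron (5'->3' DNA string) by its terminal dinucleotides."""
    seq = intron.upper()
    if len(seq) < 4:
        raise ValueError("intron too short to hold donor and acceptor dinucleotides")
    pair = f"{seq[:2]}-{seq[-2:]}"
    if pair == "GT-AG":
        return "canonical GT-AG"
    if pair in ("GC-AG", "AT-AC"):
        return f"non-canonical {pair}"
    return f"other ({pair})"


if __name__ == "__main__":
    # Toy sequences for illustration only, not data from the study.
    introns = ["GTAAGTCCTTTCAG", "GCAAGTCCTTTCAG", "ATATCCTTTTCCAC", "GTAAGTCCTTTCAA"]
    print(len(ALL_PAIRS))
    print(Counter(classify_intron(s) for s in introns))
```

Tallying labels with a `Counter` over an EST-supported intron set is the kind of summary that yields percentages like those quoted in the abstract, although the paper additionally corrects annotation and sequencing errors before counting.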
16.  Genome-Wide Association between Branch Point Properties and Alternative Splicing 
PLoS Computational Biology  2010;6(11):e1001016.
The branch point (BP) is one of the three obligatory signals required for pre-mRNA splicing. In mammals, the degeneracy of the motif combined with the lack of a large set of experimentally verified BPs complicates the task of modeling it in silico, and therefore of predicting the location of natural BPs. Consequently, BPs have been disregarded in a considerable fraction of the genome-wide studies on the regulation of splicing in mammals. We present a new computational approach for mammalian BP prediction. Using sequence conservation and positional bias we obtained a set of motifs with good agreement with U2 snRNA binding stability. Using a Support Vector Machine algorithm, we created a model complemented with polypyrimidine tract features, which considerably improves the prediction accuracy over previously published methods. Applying our algorithm to human introns, we show that BP position is highly dependent on the presence of AG dinucleotides in the 3′ end of introns, with distance to the 3′ splice site and BP strength strongly correlating with alternative splicing. Furthermore, experimental BP mapping for five exons preceded by long AG-dinucleotide exclusion zones revealed that, for a given intron, more than one BP can be chosen throughout the course of splicing. Finally, the comparison between exons of different evolutionary ages and pseudo exons suggests a key role of the BP in the pathway of exon creation in human. Our computational and experimental analyses suggest that BP recognition is more flexible than previously assumed, and it appears highly dependent on the presence of downstream polypyrimidine tracts. The reported association between BP features and the splicing outcome suggests that this, so far disregarded but yet crucial, element buries information that can complement current acceptor site models.
Author Summary
From transcription to translation, the events underlying protein production from DNA sequence are paramount to all aspects of cellular function. Pre-mRNAs in eukaryotes undergo several processing steps prior to their export to the cytoplasm. Among these, splicing – the process of intron removal and exon ligation – has been shown to play a central role in the regulation of gene expression. It has been estimated that more than half of the disease-causing mutations in humans do so by interfering with splicing. The difficulty in describing these disease mechanisms often lies in the low accuracy of the methods for prediction of functional splicing signals in the pre-mRNA. This is especially the case of the branch point, mainly due to its high sequence variability. We have developed a methodology for mammalian branch point prediction based on a machine-learning algorithm, which shows improved accuracy over previous published methods. Moreover, using a combination of experimental and bioinformatics approaches, we uncovered important positional properties of the branch point and shed new light on how some of its features may contribute to the final splicing outcome. These findings might prove useful for a better understanding of how splicing-associated mutations can lead to disease.
PMCID: PMC2991248  PMID: 21124863
17.  PRP38 encodes a yeast protein required for pre-mRNA splicing and maintenance of stable U6 small nuclear RNA levels. 
Molecular and Cellular Biology  1992;12(9):3939-3947.
An essential pre-mRNA splicing factor, the product of the PRP38 gene, has been genetically identified in a screen of temperature-sensitive mutants of Saccharomyces cerevisiae. Shifting temperature-sensitive prp38 cultures from 23 to 37 degrees C prevents the first cleavage-ligation event in the excision of introns from mRNA precursors. In vitro splicing inactivation and complementation studies suggest that the PRP38-encoded factor functions, at least in part, after stable splicing complex formation. The PRP38 locus contains a 726-bp open reading frame coding for an acidic 28-kDa polypeptide (PRP38). While PRP38 lacks obvious structural similarity to previously defined splicing factors, heat inactivation of PRP38, PRP19, or any of the known U6 (or U4/U6) small nuclear ribonucleoprotein-associating proteins (i.e., PRP3, PRP4, PRP6, and PRP24) leads to a common, unexpected consequence: intracellular U6 small nuclear RNA (snRNA) levels decrease as splicing activity is lost. Curiously, U4 snRNA, normally extensively base paired with U6 snRNA, persists in the virtual absence of U6 snRNA.
PMCID: PMC360275  PMID: 1508195
18.  Functionally important structural elements of U12 snRNA 
Nucleic Acids Research  2011;39(19):8531-8543.
U12 snRNA is analogous to U2 snRNA of the U2-dependent spliceosome and is essential for the splicing of U12-dependent introns in metazoan cells. The essential region of U12 snRNA, which base pairs to the branch site of minor class introns, is well characterized. However, other regions outside the branch site base-pairing region are not yet characterized, and the requirement for these structures in U12-dependent splicing is not clear. U12 snRNA is predicted to form an intricate secondary structure containing several stem–loops and single-stranded regions. Using a previously characterized branch site genetic suppression assay, we generated second-site mutations in the suppressor U12 snRNA to investigate the in vivo requirement of structural elements in U12-dependent splicing. Our results show that stem–loop IIa is essential for in vivo splicing. Interestingly, an evolutionarily conserved stem–loop IIb is dispensable for splicing. We also show that stem–loop III, which binds to a p65 RNA binding protein of the U11/U12 di-snRNP complex, is essential for in vivo splicing. The data validate the existence of proposed stem–loops of U12 snRNA and provide experimental support for individual secondary structures.
PMCID: PMC3201867  PMID: 21737423
19.  The Pivotal Roles of TIA Proteins in 5′ Splice-Site Selection of Alu Exons and Across Evolution 
PLoS Genetics  2009;5(11):e1000717.
More than 5% of alternatively spliced internal exons in the human genome are derived from Alu elements in a process termed exonization. Alus are comprised of two homologous arms separated by an internal polypyrimidine tract (PPT). In most exonizations, splice sites are selected from within the same arm. We hypothesized that the internal PPT may prevent selection of a splice site further downstream. Here, we demonstrate that this PPT enhanced the selection of an upstream 5′ splice site (5′ss), even in the presence of a stronger 5′ss downstream. Deletion of this PPT shifted selection to the stronger downstream 5′ss. This enhancing effect depended on the strength of the downstream 5′ss, on the efficiency of base-pairing to U1 snRNA, and on the length of the PPT. This effect of the PPT was mediated by the binding of TIA proteins and was dependent on the distance between the PPT and the upstream 5′ss. A wide-scale evolutionary analysis of introns across 22 eukaryotes revealed an enrichment in PPTs within ∼20 nt downstream of the 5′ss. For most metazoans, the strength of the 5′ss inversely correlated with the presence of a downstream PPT, indicative of the functional role of the PPT. Finally, we found that the proteins that mediate this effect, TIA and U1C, and in particular their functional domains, are highly conserved across evolution. Overall, these findings expand our understanding of the role of TIA1/TIAR proteins in enhancing recognition of exons, in general, and Alu exons, in particular.
Author Summary
Human genes are composed of functional regions, termed exons, separated by non-functional regions, termed introns. Intronic sequences may gradually accumulate mutations and subsequently become recognized by the splicing machinery as exons, a process termed exonization. Alu elements are prone to undergo exonization: more than 5% of alternatively spliced internal exons in the human genome originate from Alu elements. A typical Alu element is ∼300 nucleotides long, consisting of two arms separated by a polypyrimidine tract (PPT). Interestingly, in most cases, exonization occurs almost exclusively within either the right arm or the left, not both. Here we found that the PPT between the two arms serves as a binding site for TIA proteins and prevents the exon selection process from expanding into downstream regions. To obtain a wider overview of TIA function, we performed a cross-evolutionary analysis within 22 eukaryotes of this protein and of U1C, a protein known to interact with it, and found that functional regions of both these proteins were highly conserved. These findings highlight the pivotal role of TIA proteins in 5′ splice-site selection of Alu exons and exon recognition in general.
PMCID: PMC2766253  PMID: 19911040
20.  Trans-splicing to Spliceosomal U2 snRNA Suggests Disruption of Branch Site-U2 Pairing During pre-mRNA Splicing 
Molecular Cell  2007;26(6):883-890.
Pairing between U2 snRNA and the branch site of spliceosomal introns is essential for spliceosome assembly and is thought to be required for the first catalytic step of splicing. We have identified an RNA comprising the 5′ end of U2 snRNA and the 3′ exon of the ACT1-CUP1 reporter gene, resulting from a trans-splicing reaction in which a 5′ splice site-like sequence in the universally conserved branch site-binding region of U2 is used in trans as a 5′ splice site for both steps of splicing in vivo. Formation of this product occurs in functional spliceosomes assembled on reporter genes whose 5′ splice sites are predicted to bind poorly at the spliceosome catalytic centre. Multiple spatially disparate splice sites in U2 can be used, calling into question both the fate of its pairing to the branch site and the details of its role in splicing catalysis.
PMCID: PMC1973159  PMID: 17588521
U2 snRNA; branch site; trans-splicing; bulged duplex model; splicing catalysis
21.  Mutational analysis of Saccharomyces cerevisiae U4 small nuclear RNA identifies functionally important domains. 
Molecular and Cellular Biology  1995;15(3):1274-1285.
U4 small nuclear RNA (snRNA) is essential for pre-mRNA splicing, although its role is not yet clear. On the basis of a model structure (C. Guthrie and B. Patterson, Annu. Rev. Genet. 22:387-419, 1988), the molecule can be thought of as having six domains: stem II, 5' stem-loop, stem I, central region, 3' stem-loop, and 3'-terminal region. We have carried out extensive mutagenesis of the yeast U4 snRNA gene (SNR14) and have obtained information on the effect of mutations at 105 of its 160 nucleotides. Fifteen critical residues in the U4 snRNA have been identified in four domains: stem II, the 5' stem-loop, stem I, and the 3'-terminal region. These domains have been shown previously to be insensitive to oligonucleotide-directed RNase H cleavage (Y. Xu, S. Petersen-Bjørn, and J. D. Friesen, Mol. Cell. Biol. 10:1217-1225, 1990), suggesting that they are involved in intra- or intermolecular interactions. Stem II, a region that base pairs with U6 snRNA, is the most sensitive to mutation of all U4 snRNA domains. In contrast, stem I is surprisingly insensitive to mutational change, which brings into question its role in base pairing with U6 snRNA. All mutations in the putative Sm site of U4 snRNA yield a lethal or conditional-lethal phenotype, indicating that this region is important functionally. Only two nucleotides in the 5' stem-loop are sensitive to mutation; most of this domain can tolerate point mutations or small deletions. The 3' stem-loop, while essential, is very tolerant of change. A large portion of the central domain can be removed or expanded with only minor effects on phenotype, suggesting that it has little function of its own. Analysis of conditional mutations in stem II and stem I indicates that although these single-base changes do not have a dramatic effect on U4 snRNA stability, they are defective in RNA splicing in vivo and in vitro, as well as in spliceosome assembly. 
These results are discussed in the context of current knowledge of the interactions involving U4 snRNA.
PMCID: PMC230350  PMID: 7862121
22.  The spliceosomal snRNAs of Caenorhabditis elegans. 
Nucleic Acids Research  1990;18(9):2633-2642.
Nematodes are the only group of organisms in which both cis- and trans-splicing of nuclear mRNAs are known to occur. Most Caenorhabditis elegans introns are exceptionally short, often only 50 bases long. The consensus donor and acceptor splice site sequences found in other animals are used for both cis- and trans-splicing. In order to identify the machinery required for these splicing events, we have characterized the C. elegans snRNAs. They are similar in sequence and structure to those characterized in other organisms, and several sequence variations discovered in the nematode snRNAs provide support for previously proposed structure models. The C. elegans snRNAs are encoded by gene families. We report here the sequences of many of these genes. We find a highly conserved sequence, the proximal sequence element (PSE), about 65 bp upstream of all 21 snRNA genes thus far sequenced, including the SL RNA genes, which specify the snRNAs that provide the 5' exons in trans-splicing. The sequence of the C. elegans PSE is distinct from PSE's from other organisms.
PMCID: PMC330746  PMID: 2339054
23.  Oriented Scanning Is the Leading Mechanism Underlying 5′ Splice Site Selection in Mammals 
PLoS Genetics  2006;2(9):e138.
Splice site selection is a key element of pre-mRNA splicing. Although it is known to involve specific recognition of short consensus sequences by the splicing machinery, the mechanisms by which 5′ splice sites are accurately identified remain controversial and incompletely resolved. The human F7 gene contains in its seventh intron (IVS7) a 37-bp VNTR minisatellite whose first element spans the exon7–IVS7 boundary. As a consequence, the IVS7 authentic donor splice site is followed by several cryptic splice sites identical in sequence, referred to as 5′ pseudo-sites, which normally remain silent. This region, therefore, provides a remarkable model to decipher the mechanism underlying 5′ splice site selection in mammals. We previously suggested a model for splice site selection that, in the presence of consecutive splice consensus sequences, would stimulate exclusively the selection of the most upstream 5′ splice site, rather than repressing the 3′ following pseudo-sites. In the present study, we provide experimental support for this hypothesis by using a mutational approach involving a panel of 50 mutant and wild-type F7 constructs expressed in various cell types. We demonstrate that the F7 IVS7 5′ pseudo-sites are functional, but do not compete with the authentic donor splice site. Moreover, we show that the selection of the 5′ splice site follows a scanning-type mechanism, precluding competition with other functional 5′ pseudo-sites available in the immediate sequence context downstream of the activated one. In addition, 5′ pseudo-sites with an increased complementarity to U1 snRNA of up to 91% do not compete with the identified scanning mechanism. Altogether, these findings, which unveil a cell type–independent 5′−3′-oriented scanning process for accurate recognition of the authentic 5′ splice site, reconcile apparently contradictory observations by establishing a hierarchy of competitiveness among the determinants involved in 5′ splice site selection.
Typically, mammalian genes contain coding sequences (exons) separated by non-coding sequences (introns). Introns are removed during pre-mRNA splicing. The accurate recognition of introns during splicing is essential, as any abnormality in that process will generate abnormal mRNAs that can cause diseases. Understanding the mechanisms of accurate splice site selection is of prime interest to life scientists. Exon–intron borders (splice sites) are defined by short sequences that are poorly conserved. The strength of any splice sequence can be assessed by its degree of homology with a splice site consensus sequence. Within exons and introns, several sequences can match this consensus as well as or better than the splice sites. Using a system in which a splice site sequence is repeated several times in the intron, the authors showed that a linear 5′−3′ search is a leading mechanism underlying splice site selection. This scanning mechanism is cell type–independent, and only the most upstream splice site of the series is selected, even if splice sites with a better match to the consensus are in the vicinity. These findings reconcile contradictory observations and establish a hierarchy among the determinants involved in splice site selection.
PMCID: PMC1557585  PMID: 16948532
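The contrast drawn in entry 23, between choosing the most upstream acceptable 5′ splice site and choosing the site that best matches the consensus, can be illustrated with a minimal sketch. It is not the authors' method. The 9-nt consensus string `CAGGTAAGT`, the fraction-of-identical-positions score, the 0.7 threshold, and the function names are all assumptions introduced here for illustration only.

```python
from typing import List, Optional, Tuple

# Assumed 9-nt 5' splice site consensus (3 exonic + 6 intronic positions).
# This string is an illustrative assumption, not taken from the listing.
CONSENSUS_5SS = "CAGGTAAGT"


def consensus_match(site: str, consensus: str = CONSENSUS_5SS) -> float:
    """Fraction of positions at which `site` is identical to `consensus`."""
    site = site.upper().replace("U", "T")  # accept RNA or DNA spelling
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    matches = sum(a == b for a, b in zip(site, consensus))
    return matches / len(consensus)


def pick_by_scanning(
    candidates: List[Tuple[int, str]], threshold: float = 0.7
) -> Optional[int]:
    """Scan 5'->3' and return the first position whose score passes the
    (hypothetical) threshold, ignoring stronger sites further downstream."""
    for position, site in sorted(candidates):
        if consensus_match(site) >= threshold:
            return position
    return None


def pick_by_strength(candidates: List[Tuple[int, str]]) -> Optional[int]:
    """Return the position of the best consensus match, wherever it lies."""
    if not candidates:
        return None
    return max(candidates, key=lambda c: consensus_match(c[1]))[0]


# Two made-up candidate sites: (position in transcript, 9-nt sequence).
candidates = [(10, "CAGGTGAGT"), (47, "CAGGTAAGT")]
scanning_choice = pick_by_scanning(candidates)
strength_choice = pick_by_strength(candidates)
```

With these made-up candidates, the scanning rule selects the upstream site even though the downstream site matches the assumed consensus perfectly, which is the behaviour the abstract describes for the F7 IVS7 pseudo-sites.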
24.  Domains of human U4atac snRNA required for U12-dependent splicing in vivo 
Nucleic Acids Research  2002;30(21):4650-4657.
U4atac snRNA forms a base-paired complex with U6atac snRNA. Both snRNAs are required for the splicing of the minor U12-dependent class of eukaryotic nuclear introns. We have developed a new genetic suppression assay to investigate the in vivo roles of several regions of U4atac snRNA in U12-dependent splicing. We show that both the stem I and stem II regions, which have been proposed to pair with U6atac snRNA, are required for in vivo splicing. Splicing activity also requires U4atac sequences in the 5′ stem–loop element that bind a 15.5 kDa protein that also binds to a similar region of U4 snRNA. In contrast, mutations in the region immediately following the stem I interaction region, as well as a deletion of the distal portion of the 3′ stem–loop element, were active for splicing. Complete deletion of the 3′ stem–loop element abolished in vivo splicing function as did a mutation of the Sm protein binding site. These results show that the in vivo sequence requirements of U4atac snRNA are similar to those described previously for U4 snRNA using in vitro assays and provide experimental support for models of the U4atac/U6atac snRNA interaction.
PMCID: PMC135832  PMID: 12409455
25.  Multiple functional domains of human U2 small nuclear RNA: strengthening conserved stem I can block splicing. 
Molecular and Cellular Biology  1992;12(12):5464-5473.
We showed previously that a branch site mutation in simian virus 40 early pre-mRNA that prevented small t antigen mRNA splicing could be efficiently suppressed by a compensatory mutation in a coexpressed U2 small nuclear (sn) RNA gene. We have now generated second-site mutations in this suppressor gene to investigate regions of U2 RNA required for function. A number of mutations in a putative stem at the 5' end of the molecule inhibited splicing, indicating that bases in this region are important for activity. However, several lines of evidence suggested that formation of the entire stem is not essential for splicing. Indeed, mutations that strengthen the stem actually inhibited splicing, and evidence that this prevents a required base-pairing interaction with U6 snRNA is presented. These results suggest that the relative stabilities of competing intra- and intermolecular base-pairing interactions play an important role in the splicing reaction. Mutations in a conserved single-stranded region immediately 3' to the branch site recognition sequence all inhibited splicing, indicating that this region is required for U2 function, although its exact role remains unknown. Finally, two mutations in the loop of stem IV at the 3' end of the molecule, which destroy the binding site of U2 sn ribonucleoprotein B", prevented small t splicing; this finding contrasts with previous studies which utilized different assay systems. Analysis of the accumulation and subcellular localization of all of the mutant RNAs showed that they were similar to those of the parental suppressor U2 RNA, indicating that the effects observed indeed reflect defects in splicing.
PMCID: PMC360484  PMID: 1448079
