Approximately half of the introns in Drosophila melanogaster are too small to function in a vertebrate and often lack the pyrimidine tract associated with vertebrate 3' splice sites. Here, we report the splicing and spliceosome assembly properties of two such introns: one with a pyrimidine-poor 3' splice site and one with a pyrimidine-rich 3' splice site. The pyrimidine-poor intron was absolutely dependent on its small size for in vivo and in vitro splicing and assembly. As such, it had properties reminiscent of those of yeast introns. The pyrimidine-rich intron had properties intermediate between those of yeasts and vertebrates. This 3' splice site directed assembly of ATP-dependent complexes when present as either an intron or exon and supported low levels of in vivo splicing of a moderate-length intron. We propose that splice sites can be recognized as pairs across either exons or introns, depending on which distance is shorter, and that a pyrimidine-rich region upstream of the 3' splice site facilitates the exon mode.
cis-spliced nuclear pre-mRNA introns found in a variety of organisms, including Tetrahymena thermophila, Drosophila melanogaster, Caenorhabditis elegans, and plants, are significantly richer in adenosine and uridine residues than their flanking exons are. The functional significance of this intronic AU richness, however, has been demonstrated only in plant nuclei. In these nuclei, 5' and 3' splice sites are selected in part by their positions relative to AU-rich elements spread throughout the length of an intron. Because of this position-dependent selection scheme, a 5' splice site at the normal (+1) exon-intron boundary having only three contiguous consensus nucleotides can compete effectively with an enhanced exonic site (-57E) having nine consensus nucleotides and outcompete an enhanced site (+106E) embedded within the AU-rich intron. To determine whether transitions from AU-poor exonic sequences to AU-rich intronic sequences influence 5' splice site selection in other organisms, alleles of the pea rbcS3A1 intron were expressed in Drosophila Schneider 2 cells, and their splicing patterns were compared with those in tobacco nuclei. We demonstrate that this heterologous transcript can be accurately spliced in transfected Drosophila nuclei and that a +1 G-to-A knockout mutation at the normal splice site activates the same three cryptic 5' splice sites as in tobacco. Enhancement of the exonic (-57) and intronic (+106) sites to consensus splice sites indicates that potential splice sites located in the upstream exon or at the 5' exon-intron boundary are preferred in Drosophila cells over those embedded within AU-rich intronic sequences. In contrast to tobacco, in which the activities of two competing 5' splice sites upstream of the AU-rich intron are modulated by their proximity to the AU transition point, D. melanogaster utilizes the upstream site which has a higher proportion of consensus nucleotides. 
The enhanced version of the cryptic intronic site is efficiently selected in D. melanogaster when the normal +1 site is weakened or discrete AU-rich elements upstream of the +106E site are disrupted. Selection of this internal site in tobacco requires more drastic disruption of these motifs. We conclude that 5' splice site selection in Drosophila nuclei is influenced by the intrinsic strengths of competing sites and by the presence of AU-rich intronic elements but to a different extent than in tobacco.
Pre-mRNA splicing is carried out by the spliceosome, which identifies exons and removes intervening introns. In vertebrates, most splice sites are initially recognized by the spliceosome across the exon, because most exons are small and surrounded by large introns. This gene architecture predicts that efficient exon recognition depends largely on the strength of the flanking 3′ and 5′ splice sites. However, it is unknown if the 3′ or the 5′ splice site dominates the exon recognition process. Here, we test the 3′ and 5′ splice site contributions towards efficient exon recognition by systematically replacing the splice sites of an internal exon with sequences of different splice site strengths. We show that the presence of an optimal splice site does not guarantee exon inclusion and that the best predictor for exon recognition is the sum of both splice site scores. Using a genome-wide approach, we demonstrate that the combined 3′ and 5′ splice site strengths of internal exons provide a much more significant separator between constitutive and alternative exons than either the 3′ or the 5′ splice site strength alone.
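The observation that the summed score outperforms either site alone can be illustrated with a short sketch. The score values, the threshold, and the names `Exon`, `combined_score`, and `predicted_included` are hypothetical placeholders, not values or methods taken from the study; any splice-site scoring scheme could supply the inputs.

```python
from typing import NamedTuple


class Exon(NamedTuple):
    name: str
    ss3_score: float  # 3' splice site strength (hypothetical units)
    ss5_score: float  # 5' splice site strength (hypothetical units)


def combined_score(exon: Exon) -> float:
    """Sum of the 3' and 5' splice site scores."""
    return exon.ss3_score + exon.ss5_score


def predicted_included(exon: Exon, threshold: float = 14.0) -> bool:
    """Predict inclusion from the summed score; the threshold is an assumption."""
    return combined_score(exon) >= threshold


exons = [
    Exon("optimal_3ss_poor_5ss", 11.0, 2.0),  # one optimal site, one poor site
    Exon("moderate_both", 8.0, 8.0),          # two moderate sites
]

for exon in exons:
    print(exon.name, combined_score(exon), predicted_included(exon))
```

In this toy example the exon with one optimal site falls below the assumed threshold while the exon with two moderate sites passes it, which mirrors the statement that an optimal splice site alone does not guarantee inclusion.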
Both experimental work and surveys of the lengths of internal exons in nature have suggested that vertebrate internal exons require a minimum size of approximately 50 nucleotides for efficient inclusion in mature mRNA. This phenomenon has been ascribed to steric interference between complexes involved in recognition of the splicing signals at the two ends of short internal exons. To determine whether U1 small nuclear ribonucleoprotein, a multicomponent splicing factor that is involved in the first recognition of splice sites, contributes to the lower size limit of vertebrate internal exons, we have taken advantage of our previous observation that U1 small nuclear RNAs (snRNAs) which bind upstream or downstream of the 5' splice site (5'SS) stimulate splicing of the upstream intron. By varying the position of U1 binding relative to the 3'SS, we show that U1-dependent splicing of the upstream intron becomes inefficient when U1 is positioned 48 nucleotides or less downstream of the 3'SS, suggesting a minimal distance between U1 and the 3'SS of approximately 50 nucleotides. This distance corresponds well to the suggested minimum size of internal exons. The results of experiments in which the 3'SS region of the reporter was duplicated suggest an optimal distance of greater than 72 nucleotides. We have also found that inclusion of a 24-nucleotide miniexon is promoted by the binding of U1 to the downstream intron but not by binding to the 5'SS. Our results are discussed in the context of models to explain constitutive splicing of small exons in nature.
Auxiliary splicing signals play a major role in the regulation of constitutive and alternative pre-mRNA splicing, but their relative importance in selection of mutation-induced cryptic or de novo splice sites is poorly understood. Here, we show that exonic sequences between authentic and aberrant splice sites that were activated by splice-site mutations in human disease genes have lower frequencies of splicing enhancers and higher frequencies of splicing silencers than average exons. Conversely, sequences between authentic and intronic aberrant splice sites have more enhancers and fewer silencers than average introns. Exons that were skipped as a result of splice-site mutations were smaller, had lower SF2/ASF motif scores, a decreased availability of decoy splice sites and a higher density of silencers than exons in which splice-site mutation activated cryptic splice sites. These four variables were the strongest predictors of the two aberrant splicing events in a logistic regression model. Elimination or weakening of predicted silencers in two reporters consistently promoted use of intron-proximal splice sites if these elements were maintained at their original positions, with their modular combinations producing expected modification of splicing. Together, these results show the existence of a gradient in exon and intron definition at the level of pre-mRNA splicing and provide a basis for the development of computational tools that predict aberrant splicing outcomes.
Pseudo-exons are intronic sequences that are flanked by apparent consensus splice sites but that are not observed in spliced mRNAs. Pseudo-exons are often difficult to activate by mutation and have typically been viewed as a conceptual challenge to our understanding of how the spliceosome discriminates between authentic and cryptic splice sites. We have analyzed an apparent pseudo-exon located downstream of mutually exclusive exons 2 and 3 of the rat α-tropomyosin (TM) gene. The TM pseudo-exon is conserved among mammals and has a conserved profile of predicted splicing enhancers and silencers that is more typical of a genuine exon than a pseudo-exon. Splicing of the pseudo-exon is fully activated for splicing to exon 3 by a number of simple mutations. Splicing of the pseudo-exon to exon 3 is predicted to lead to nonsense-mediated decay (NMD). In contrast, when “prespliced” to exon 2 it follows a “zero length exon” splicing pathway in which a newly generated 5′ splice site at the junction with exon 2 is spliced to exon 4. We propose that a subset of apparent pseudo-exons, as exemplified here, are actually authentic alternative exons whose inclusion leads to NMD.
Incorporation of exon 11 of the insulin receptor gene is both developmentally and hormonally regulated. Previously, we have shown the presence of enhancer and silencer elements that modulate the incorporation of the small 36-nucleotide exon. In this study, we investigated the role of inherent splice site strength in the alternative splicing decision and whether recognition of the splice sites is the major determinant of exon incorporation.
We found that mutation of the flanking sub-optimal splice sites to consensus sequences caused the exon to be constitutively spliced in vivo. These findings are consistent with the exon-definition model for splicing. In vitro splicing of RNA templates containing exon 11 and portions of the upstream intron recapitulated the regulation seen in vivo. Unexpectedly, we found that the splice sites were occupied and spliceosomal complex A was assembled on all templates in vitro irrespective of splicing efficiency.
These findings demonstrate that the exon-definition model explains alternative splicing of exon 11 in the IR gene in vivo but not in vitro. The in vitro results suggest that the regulation occurs at a later step in spliceosome assembly on this exon.
Pre-mRNA transcripts in a variety of organisms, including plants, Drosophila and Caenorhabditis elegans, contain introns which are significantly richer in adenosine and uridine residues than their flanking exons. Previous analyses using exonic and intronic replacements between two nonequivalent 5' splice sites in the 469 nt long rbcS3A intron 1 provided the first evidence indicating that, in both tobacco and Drosophila nuclei, 5' splice site selection is strongly influenced by the position of that site relative to the AU transition point between exon and intron. To differentiate between two potential models for 5' splice site recognition, we have expressed a completely different set of intronic and exonic replacement constructs containing identical 5' splice sites upstream of beta-conglycinin intron 4 (115 nt). Mutagenesis and deletion of the upstream 5' splice site demonstrate that intronic AU-rich sequences function by promoting recognition of the most upstream 5' splice site rather than by masking the downstream 5' splice site. Sequence insertions define a role for AG-rich exonic sequences in plant pre-mRNA splicing by demonstrating that an AG-rich element is capable of promoting downstream 5' splice site recognition. We conclude that AU-rich intronic sequences, AG-rich exonic sequences and the 5' splice site itself collectively define 5' intron boundaries in dicot nuclei.
Inefficient splicing of human immunodeficiency virus type 1 (HIV-1) RNA is necessary to preserve unspliced and singly spliced viral RNAs for transport to the cytoplasm by the Rev-dependent pathway. Signals within the HIV-1 genome that control the rate of splicing include weak 3′ splice sites, exon splicing enhancers (ESE), and exon splicing silencers (ESS). We have previously shown that an ESS present within tat exon 2 (ESS2) and a suboptimal 3′ splice site together act to inhibit splicing at the 3′ splice site flanking tat exon 2. This occurs at an early step in spliceosome assembly. Splicing at the 3′ splice site flanking tat exon 3 is regulated by a bipartite element composed of an ESE and an ESS (ESS3). Here we show that ESS3 is composed of two smaller elements (AGAUCC and UUAG) that can inhibit splicing independently. We also show that ESS3 is more active in the context of a heterologous suboptimal splice site than of an optimal 3′ splice site. ESS3 inhibits splicing by blocking the formation of a functional spliceosome at an early step, since A complexes are not detected in the presence of ESS3. Competitor RNAs containing either ESS2 or ESS3 relieve inhibition of splicing of substrates containing ESS3 or ESS2. This suggests that a common cellular factor(s) may be required for the inhibition of tat mRNA splicing mediated by ESS2 and ESS3.
While the current model of pre-mRNA splicing is based on the recognition of four canonical intronic motifs (5' splice site, branchpoint sequence, polypyrimidine (PY) tract and 3' splice site), it is becoming increasingly clear that splicing is regulated by both canonical and non-canonical splicing signals located in the RNA sequence of introns and exons that act to recruit the spliceosome and associated splicing factors. The diversity of human intronic sequences suggests the existence of novel recognition pathways for non-canonical introns. This study addresses the recognition and splicing of human introns that lack a canonical PY tract. The PY tract is a uridine-rich region at the 3' end of introns that acts as a binding site for U2AF65, a key factor in splicing machinery recruitment.
Human introns were classified computationally into low- and high-scoring PY tracts by scoring the likely U2AF65 binding site strength. Biochemical studies confirmed that low-scoring PY tracts are weak U2AF65 binding sites while high-scoring PY tracts are strong U2AF65 binding sites. A large population of human introns contains weak PY tracts. Computational analysis revealed many families of motifs, including C-rich and G-rich motifs, that are enriched upstream of weak PY tracts. In vivo splicing studies show that C-rich and G-rich motifs function as intronic splicing enhancers in a combinatorial manner to compensate for weak PY tracts.
The enrichment of specific intronic splicing enhancers upstream of weak PY tracts suggests that a novel mechanism for intron recognition exists, which compensates for a weakened canonical pre-mRNA splicing motif.
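The computational split into low- and high-scoring PY tracts can be sketched as follows. This is a simplified stand-in, not the scoring method used in the study: it uses the uridine fraction in a fixed window upstream of the terminal dinucleotide as a rough proxy for U2AF65 binding site strength. The window size, the cutoff, and the function names are assumptions made for illustration.

```python
def py_tract_score(intron: str, window: int = 20, skip: int = 2) -> float:
    """Fraction of uridines in the window just upstream of the 3' splice site.

    `skip` removes the terminal dinucleotide (normally AG) before the window
    is taken. Both parameters are illustrative assumptions.
    """
    seq = intron.upper().replace("T", "U")
    core = seq[:-skip] if skip else seq
    tract = core[-window:]
    if not tract:
        return 0.0
    return tract.count("U") / len(tract)


def classify_py_tract(intron: str, cutoff: float = 0.5) -> str:
    """Label an intron 'high' or 'low'; the cutoff is an assumption."""
    return "high" if py_tract_score(intron) >= cutoff else "low"


# Two made-up introns: one with a U-rich 3' end, one with a G/A/C-rich 3' end.
strong = "GUAAGU" + "A" * 10 + "UUUUCUUUUUCUUUUUUUCC" + "AG"
weak = "GUAAGU" + "A" * 10 + "GAGGCAGGGACAGCAGGGCA" + "AG"

for name, seq in (("strong", strong), ("weak", weak)):
    print(name, py_tract_score(seq), classify_py_tract(seq))
```

A position-weighted model of U2AF65 binding would be closer to the biochemistry; the plain fraction is used here only to keep the sketch short.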
The human thrombopoietin (TPO) gene, which codes for the principal cytokine involved in platelet maturation, shows a peculiar alternative splicing of its last exon, where an intra-exonic 116 nt alternative intron is spliced out in a fraction of its mRNA. To characterize the molecular mechanism underlying this alternative splicing, minigenes of TPO genomic constructs with variable exon–intron configurations or carrying exclusively the TPO cDNA were generated and transiently transfected in the Hep3B cell line. We have found that the final rate of the alternative intron splicing is determined by three elements: the presence of upstream constitutive introns, the suboptimal splice sites of the alternative intron and the length of the alternative intron itself. Our results indicate that the recognition of suboptimal intra-exonic splice junctions in the TPO gene is influenced by the assembly of the spliceosome complex on constitutive introns and by a qualitative scanning of the sequence by the transcriptional/splicing machinery complex primed by upstream splicing signals.
We have shown previously that truncation of the human beta-globin pre-mRNA in the second exon, 14 nucleotides downstream from the 3' splice site, leads to inhibition of splicing but not cleavage at the 5' splice site. We now show that several nonglobin sequences substituted at this site can restore splicing and that the efficiency of splicing depends on the length of the second (downstream) exon and not a specific sequence. Deletions in the first exon have no effect on the efficiency of in vitro splicing. Surprisingly, an intron fragment from the 5' region of the human or rabbit beta-globin intron 2, when placed 14 nucleotides downstream from the 3' splice site, inhibited all the steps in splicing beginning with cleavage at the 5' splice site. This result suggests that the intron 2 fragment carries a "poison" sequence that can inhibit the splicing of an upstream intron.
Pre-mRNA splicing in plants, while generally similar to the processes in vertebrates and yeast, is thought to involve plant specific cis-acting elements. Both monocot and dicot introns are typically strongly enriched in U nucleotides, and AU- or U-rich segments are thought to be involved in intron recognition, splice site selection, and splicing efficiency. We have applied logitlinear models to find optimal combinations of splice site variables for the purpose of separating true splice sites from a large excess of potential sites. It is shown that plant splice site prediction from sequence inspection is greatly improved when compositional contrast between exons and introns is considered in addition to degree of matching to the splice site consensus (signal quality). The best model involves subclassification of splice sites according to the identity of the base immediately upstream of the GU and AG signals and gives substantial performance gains compared with conventional profile methods.
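The idea of combining signal quality with exon-intron compositional contrast in a logit-linear model can be sketched as below. The flank length, the coefficients, and the function names are illustrative assumptions and are not the fitted model from the study; the subclassification by the base upstream of GU or AG is not implemented here.

```python
import math


def u_fraction(seq: str) -> float:
    """Fraction of U (or T) residues in a sequence."""
    seq = seq.upper().replace("T", "U")
    return seq.count("U") / len(seq) if seq else 0.0


def donor_contrast(seq: str, gu_pos: int, flank: int = 50) -> float:
    """U content on the intron side minus U content on the exon side of a GU.

    `gu_pos` is the index of the G in the candidate GU; `flank` is an
    assumed window length.
    """
    exon_side = seq[max(0, gu_pos - flank):gu_pos]
    intron_side = seq[gu_pos:gu_pos + flank]
    return u_fraction(intron_side) - u_fraction(exon_side)


def site_probability(signal_quality: float, contrast: float,
                     b0: float = -4.0, b_signal: float = 0.5,
                     b_contrast: float = 6.0) -> float:
    """Logit-linear combination of the two variables.

    All coefficients are made-up placeholders, not fitted values.
    """
    z = b0 + b_signal * signal_quality + b_contrast * contrast
    return 1.0 / (1.0 + math.exp(-z))


# Toy sequence: a GC-rich "exon" followed by a U-rich "intron" starting with GU.
toy = "GCGCAGGCCG" * 3 + "GUAUUUAUUUUAUUUAUUUUAUUUAUUUUA"
contrast = donor_contrast(toy, gu_pos=30, flank=30)
print(round(contrast, 3), round(site_probability(5.0, contrast), 3))
```

With equal signal quality, a candidate site at a sharp U-poor to U-rich transition receives a higher probability than one with no contrast, which is the qualitative behaviour the abstract describes.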
Exon 3 of the human apolipoprotein A-II (apoA-II) gene is efficiently included in the mRNA although its acceptor site is significantly weak because of a peculiar (GU)16 tract instead of a canonical polypyrimidine tract within the intron 2/exon 3 junction. Our previous studies demonstrated that the SR proteins ASF/SF2 and SC35 bind specifically an exonic splicing enhancer (ESE) within exon 3 and promote exon 3 splicing. In the present study, we show that the ESE is necessary only in the proper context. In addition, we have characterized two novel sequences in the flanking introns that modulate apoA-II exon 3 splicing. There is a G-rich element in intron 2 that interacts with hnRNPH1 and inhibits exon 3 splicing. The second is a purine rich region in intron 3 that binds SRp40 and SRp55 and promotes exon 3 inclusion in mRNA. We have also found that the (GU) repeats in the apoA-II context bind the splicing factor TDP-43 and interfere with exon 3 definition. Significantly, blocking of TDP-43 expression by small interfering RNA overrides the need for all the other cis-acting elements making exon 3 inclusion constitutive even in the presence of disrupted exonic and intronic enhancers. Altogether, our results suggest that exonic and intronic enhancers have evolved to balance the negative effects of the two silencers located in intron 2 and hence rescue the constitutive exon 3 inclusion in apoA-II mRNA.
Alternative splicing increases transcriptome and proteome diversification. Previous analyses aiming at comparing the rate of alternative splicing between different organisms provided contradictory results. These contradictory results were attributed to the fact that both analyses were dependent on the expressed sequence tag (EST) coverage, which varies greatly between the tested organisms. In this study we compare the level of alternative splicing among eight different organisms. By employing an EST-independent approach we reveal that the percentage of genes and exons undergoing alternative splicing is higher in vertebrates than in invertebrates. We also find that alternative exons of the skipping type are flanked by longer introns than constitutive ones, whereas alternative 5′ and 3′ splice site events generally are not. In addition, although the regulation of alternative splicing and sizes of introns and exons have changed during metazoan evolution, intron retention remained the rarest type of alternative splicing, whereas exon skipping is more prevalent and exhibits a slight increase from invertebrates to vertebrates. The difference in the level of alternative splicing suggests that alternative splicing may contribute greatly to the higher level of phenotypic complexity in mammals, and that accumulation of introns confers an evolutionary advantage as it allows increasing the number of alternative splicing forms.
Splicing of small introns in lower eucaryotes can be distinguished from vertebrate splicing by the inability of such introns to be expanded and by the inability of splice site mutations to cause exon skipping, properties suggesting that the intron rather than the exon is the unit of recognition. Vertebrates do contain small introns. To see if they possess properties similar to small introns in lower eucaryotes, we studied the small second intron from the human alpha-globin gene. Mutation of the 5' splice site of this intron resulted in in vivo intron inclusion, not exon skipping, suggesting the presence of intron bridging interactions. The intron had an unusual base composition reflective of a sequence bias present in a collection of small human introns in which multiple G triplets stud the interior of the introns. Each G triplet represented a minimal sequence element additively contributing to maximal splicing efficiency and spliceosome assembly. More importantly, G triplets proximal to a duplicated splice site caused preferential utilization of the 5' splice site upstream of the triplets or the 3' splice site downstream of the triplets; i.e., sequences containing G triplets were preferentially used as introns when a choice was possible. Thus, G triplets internal to a small intron have the ability to affect splice site decisions at both ends of the intron. Each G triplet additively contributed to splice site selectivity. We suggest that G triplets are a common component of human 5' splice sites and aid in the definition of exon-intron borders as well as overall splicing efficiency. In addition, our data suggest that such intronic elements may be characteristic of small introns and represent an intronic equivalent to the exon enhancers that facilitate recognition of both ends of an exon during exon definition.
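Because each G triplet is described as contributing additively, a simple count of triplets per intron is a natural first descriptor. The sketch below counts non-overlapping GGG runs and reports their density; the function names and the example sequence are hypothetical and are not taken from the study.

```python
import re


def count_g_triplets(intron: str) -> int:
    """Count non-overlapping GGG runs (a run of four Gs counts once)."""
    return len(re.findall("GGG", intron.upper()))


def g_triplet_density(intron: str) -> float:
    """G triplets per nucleotide of intron; 0.0 for an empty sequence."""
    return count_g_triplets(intron) / len(intron) if intron else 0.0


# Made-up small intron with three G triplets in its interior.
example = "GUGAGGGCUGGGCCGGGUCCCAG"
print(count_g_triplets(example), round(g_triplet_density(example), 3))
```

Counting non-overlapping matches is a design choice; whether a longer G run behaves as one element or several is not something this sketch can decide.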
A guanosine to cytosine transversion at position 2 of the fifth intron of the mitochondrial gene COB blocks the ligation step of splicing. This mutation prevents the formation of a base pair within the P1 helix of this group I intron, the RNA duplex formed between the 3' end of the upstream exon and the internal guide sequence. The mutation also reduces the rate of the first step of splicing (guanosine addition at the 5' splice junction) while stimulating hydrolysis at the 3' intron-exon boundary. Consequently, the ligation of exons is blocked because the 3' exon is removed prior to cleavage at the 5' splice junction. The lesion can be suppressed by second-site mutations that preserve the potential for base-pairing at this position. Because the P1 duplex and the P10 duplex (between the guide sequence and the 3' exon) overlap at the affected position, these pairings represent alternative structures that do not, indeed cannot, form simultaneously.
Adenosine to inosine editing of mRNA from the human 5-HT2C receptor gene (HTR2C) occurs at five exonic positions (A–E) in a stable stem–loop that includes the normal 5′ splice site of intron 5 and is flanked by two alternative splice sites. Using in vitro editing, we identified a novel editing site (F) located in the intronic part of the stem–loop and demonstrated editing at this site in human brain. We have shown that in cell culture, base substitutions to mimic editing at different combinations of the six sites profoundly affect relative splicing at the normal and the upstream alternative splice site, but splicing at the downstream alternative splice site was consistently rare. Editing combinations in different splice variants from human brain were determined and are consistent with the effects of editing on splicing observed in cell culture. As RNA editing usually occurs close to exon/intron boundaries, this is likely to be a general phenomenon and suggests an important novel role for RNA editing.
The control of RNA splicing is often modulated by exonic motifs near splice sites. Chief among these are exonic splice enhancers (ESEs). Well-described ESEs in mammals are purine rich and cause predictable skews in codon and amino acid usage toward exonic ends. Looking across species, those with relatively abundant intronic sequence are those with the more profound end of exon skews, indicative of exonization of splice site recognition. To date, the only intron-rich species that have been analyzed are mammals, precluding any conclusions about the likely ancestral condition. Here, we examine the patterns of codon and amino acid usage in the vicinity of exon–intron junctions in the brown alga Ectocarpus siliculosus, a species with abundant large introns, known SR proteins, and classical splice sites. We find that amino acids and codons preferred/avoided at both 3′ and 5′ ends in Ectocarpus, of which there are many, tend, on average, to also be preferred/avoided at the same exon ends in humans. Moreover, the preferences observed at the 5′ ends of exons are largely the same as those at the 3′ ends, a symmetry trend only previously observed in animals. We predict putative hexameric ESEs in Ectocarpus and show that these are purine rich and that there are many more of these identified as functional ESEs in humans than expected by chance. These results are consistent with deep phylogenetic conservation of SR protein binding motifs. Assuming codons preferred near boundaries are “splice optimal” codons, in Ectocarpus, unlike Drosophila, splice optimal and translationally optimal codons are not mutually exclusive. The exclusivity of translationally optimal and splice optimal codon sets is thus not universal.
ESE; Ectocarpus; splicing; translational selection
The relationship between polyadenylation and splicing was investigated in a model system consisting of two tandem but nonidentical polyomavirus late transcription units. This model system exploits the polyomavirus late transcription termination and polyadenylation signals, which are sufficiently weak to allow the production of many multigenome-length primary transcripts with repeating introns, exons, and poly(A) sites. This double-genome construct contains exons of two types, those bordered by 3' and 5' splice sites (L1 and L2) and those bordered by a 3' splice site and a poly(A) site (V1 and V2). The L1 and L2 exons are distinguishable from one another but retain identical flanking RNA processing signals, as is the case for the V1 and V2 exons. Analysis of cytoplasmic RNAs obtained from mouse cells transfected with this construct and its derivatives revealed the following. (i) V1 and V2 exons are often skipped during pre-mRNA processing, while L1 and L2 exons are not skipped. (ii) No messages contain internal, unused polyadenylation signals. (iii) Poly(A) site choice is not required for the selection of an upstream 3' splice site. (iv) When two tandem poly(A) sites are placed downstream of a 3' splice site, the first poly(A) site is chosen almost exclusively, even though transcription can proceed past both sites. (v) Placing a 3' splice site between these two tandem poly(A) sites allows the more distal site to be chosen. These and other available data are most consistent with a model in which terminal exons are produced by the coordinate selection and use of a 3' splice site with the nearest available downstream poly(A) site.
TG dinucleotides functioning as alternative 3' splice sites were identified and experimentally verified in 36 human genes.
Despite some degeneracy of sequence signals that govern splicing of eukaryotic pre-mRNAs, it is an accepted rule that U2-dependent introns exhibit the 3' terminal dinucleotide AG. Intrigued by anecdotal evidence for functional non-AG 3' splice sites, we carried out a human genome-wide screen.
We identified TG dinucleotides functioning as alternative 3' splice sites in 36 human genes. The TG-derived splice variants were experimentally validated with a success rate of 92%. Interestingly, ratios of alternative splice variants are tissue-specific for several introns. TG splice sites and their flanking intron sequences are substantially conserved between orthologous vertebrate genes, even between human and frog, indicating functional relevance. Remarkably, TG splice sites are exclusively found as alternative 3' splice sites, never as the sole 3' splice site for an intron, and we observed a distance constraint for TG-AG splice site tandems.
Since TG splice sites are exclusively found as alternative 3' splice sites, the U2 spliceosome apparently accomplishes perfect specificity for 3' AGs at an early splicing step, but may choose 3' TGs during later steps. Given the tiny fraction of TG 3' splice sites compared to the vast number of non-viable TGs, cis-acting sequence signals must significantly contribute to splice site definition. Thus, we consider TG-AG 3' splice site tandems as promising subjects for studies on the mechanisms of 3' splice site selection.
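A first step in any screen for TG-AG tandems is to enumerate TG and AG dinucleotides that lie close together. The sketch below does only that; the `max_gap` value is a hypothetical parameter, since the distance constraint is not quantified here, and the function name is an assumption. Most candidates returned by such a scan would be non-viable, so it is a candidate generator rather than a predictor.

```python
from typing import List, Tuple


def find_tg_ag_tandems(seq: str, max_gap: int = 6) -> List[Tuple[int, int]]:
    """Return (tg_index, ag_index) pairs whose start positions differ by at
    most `max_gap` nucleotides. Indices are 0-based positions of the first
    base of each dinucleotide.
    """
    seq = seq.upper().replace("U", "T")
    tg_sites = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "TG"]
    ag_sites = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    return [(t, a) for t in tg_sites for a in ag_sites
            if abs(t - a) <= max_gap]


# Made-up 3' end of an intron followed by the first exon bases.
candidate = "CCTTTCTTGCAGGT"
print(find_tg_ag_tandems(candidate))
```

A real screen would additionally require transcript evidence for use of the TG, as in the experimental validation described above.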
Exons are typically only 140 nt in length and are surrounded by intronic oceans that are thousands of nucleotides long. Four core splicing signals, aided by splicing-regulatory sequences (SRSs), direct the splicing machinery to the exon/intron junctions. Many different algorithms have been developed to identify and score the four splicing signals and thousands of putative SRSs have been identified, both computationally and experimentally. Here we describe SROOGLE, a webserver that makes splicing signal sequence and scoring data available to the biologist in an integrated, visual, easily interpretable, and user-friendly format. SROOGLE's input consists of the sequence of an exon and flanking introns. The graphic browser output displays the four core splicing signals with scores based on nine different algorithms and highlights sequences belonging to 13 different groups of SRSs. The interface also offers the ability to examine the effect of point mutations at any given position, as well as a range of additional metrics and statistical measures regarding each potential signal. SROOGLE is available at http://sroogle.tau.ac.il, and may also be downloaded as a desktop version.
We describe a new program called cryptic splice finder (CSF) that can reliably identify cryptic splice sites (css), so providing a useful tool to help investigate splicing mutations in genetic disease. We report that many css are not entirely dormant and are often already active at low levels in normal genes prior to their enhancement in genetic disease. We also report a fascinating correlation between the positions of css and introns, whereby css within the exons of one species frequently match the exact position of introns in equivalent genes from another species. These results strongly indicate that many introns were inserted into css during evolution and they also imply that the splicing information that lies outside some introns can be independently recognized by the splicing machinery and was in place prior to intron insertion. This indicates that non-intronic splicing information had a key role in shaping the split structure of eukaryote genes.
Recognition of 5' splice points by group I and group II self-splicing introns involves the interaction of exon sequences (directly preceding the 5' splice site) with intronic sequence elements. We show here that the exon binding sequences (EBS) of group II intron aI5c can accept various substitutes of the authentic intron binding sites (IBS) provided in cis or in trans. The efficiency of cleavages at these cryptic 5' splice sites was enhanced by deletion of the authentic IBS2 element. All cryptic 5' cleavage sites studied here were preceded by an IBS1-like sequence, indicating that the IBS1/EBS1 pairing alone is sufficient for proper 5' splice site selection by the intronic EBS element. The results are discussed in terms of minimal requirements for 5' cleavages and position effects of IBS sites relative to the intron.
Mammalian genes are characterized by relatively small exons surrounded by variable lengths of intronic sequence. Sequences similar to the splice signals that define the 5′ and 3′ boundaries of these exons are also present in abundance throughout the surrounding introns. What causes the real sites to be distinguished from the multitude of pseudosites in pre-mRNA is unclear. Much progress has been made in defining additional sequence elements that enhance the use of particular sites. Less work has been done on sequences that repress the use of particular splice sites. To find additional examples of sequences that inhibit splicing, we searched human genomic DNA libraries for sequences that would inhibit the inclusion of a constitutively spliced exon. Genetic selection experiments suggested that such sequences were common, and we subsequently tested randomly chosen restriction fragments of about 100 bp. When inserted into the central exon of a three-exon minigene, about one in three inhibited inclusion, revealing a high frequency of inhibitory elements in human DNA. In contrast, only 1 in 27 Escherichia coli DNA fragments was inhibitory. Several previously identified silencing elements derived from alternatively spliced exons functioned weakly in this constitutively spliced exon. In contrast, a high-affinity site for U2AF65 strongly inhibited exon inclusion. Together, our results suggest that splicing occurs in a background of repression and, since many of our inhibitors contain splice-like signals, we suggest that repression of some pseudosites may occur through an inhibitory arrangement of these sites.