Our previous analyses of the distribution of the FAS-hex3 ESS hexamers in introns adjacent to constitutive splice sites suggested a role for ESSs in splice site definition (Wang et al., 2004). More detailed analysis of the distribution of these hexamers showed that a peak in ESS density occurs just 3′ of constitutive exons which have a strong downstream ‘decoy’ 5′ss (i.e. a sequence that matches the 5′ss consensus as well as or better than the authentic site), but that this peak is greatly reduced in the absence of a decoy 5′ss. A peak in ESS density also occurs just 5′ of the branch/3′ss region of exons which have a strong upstream decoy 3′ss, but again no pronounced peak was visible in the absence of a decoy 3′ss.
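The decoy definition above (a site that matches the 5′ss consensus as well as or better than the authentic site) can be illustrated with a minimal sketch. It assumes a deliberately simplified score: the number of matches to the six intronic consensus positions GTRAGT quoted later in the text. This stands in for whatever scoring scheme was actually used, which is not specified in this section, and the function names `consensus_score` and `find_decoys` are hypothetical.

```python
# Simplified 5'ss consensus for the first six intronic positions,
# based on the GTRAGT consensus quoted in the text (R = A or G).
# Assumption: a plain match count, not the authors' scoring method.
CONSENSUS = ["G", "T", "AG", "A", "G", "T"]


def consensus_score(site):
    """Return the number of positions (0 to 6) at which a 6-nt site matches."""
    site = site.upper()
    if len(site) != len(CONSENSUS):
        raise ValueError("site must be exactly 6 nt long")
    return sum(base in allowed for base, allowed in zip(site, CONSENSUS))


def find_decoys(downstream_seq, authentic_site):
    """List (offset, site, score) for each GT-initiated 6-mer that scores
    at least as well as the authentic site."""
    threshold = consensus_score(authentic_site)
    seq = downstream_seq.upper()
    decoys = []
    for i in range(len(seq) - 5):
        candidate = seq[i:i + 6]
        if candidate.startswith("GT"):
            score = consensus_score(candidate)
            if score >= threshold:
                decoys.append((i, candidate, score))
    return decoys


# Toy example: an authentic site with one mismatch and a downstream
# region containing a perfect consensus match.
print(find_decoys("CCGTAAGTCC", "GTAAGA"))
```

In this toy input the downstream GTAAGT is reported as a decoy because it scores 6 against the authentic site's 5. A realistic analysis would use a position-weighted model over a longer window.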
[Figure: ESSs generally influence splice site choice]
ESSs Consistently Inhibit Usage of Intron-Proximal 5′ Splice Sites
These decoy-dependent peaks in ESS density suggested the hypothesis that ESSs function commonly as specificity determinants for splice site recognition by inhibiting the use of intron-proximal decoy 5′ss and 3′ss (i.e. inhibiting downstream decoy 5′ss and upstream decoy 3′ss). To directly test this idea, we constructed splicing reporter systems containing competing 5′ss or 3′ss (Table S1
). Between competing 5′ss of similar strength, a diverse panel of ESSs was inserted, including: 9 decamers representing the 7 major groups obtained previously by clustering the ESS decamers based on sequence similarity (FAS-ESS groups A through G), and 2 unclustered FAS-ESS decamers; 6 FAS-hex3 hexamers representing consensus sequences of FAS-ESS groups; and 6 decamers designed to contain pairs of overlapping FAS-hex3 hexamers, as well as 6 randomly chosen control decamers (Tables S2, S3
). With control sequences inserted between the two 5′ss, ~30% of transcripts were spliced at the intron-proximal 5′ss (Fig. S1A; Tables S4, S5
). Strikingly, all but one of the 21 ESS elements tested significantly inhibited usage of the intron-proximal 5′ss relative to the distal site. These results support the existence of a general rule for splicing regulatory element activity: sequences capable of inhibiting exon inclusion from an exonic location generally have the ability to inhibit an intron-proximal 5′ss when present between competing 5′ss.
Insertion of ESE sequences identified by the RESCUE-ESE method (Fairbrother et al., 2002) between the competing 5′ss gave very different results. The 10 ESEs tested either had no detectable effect on splice site usage or, in 5 of 10 cases, enhanced usage of the intron-proximal 5′ss (Fig. S1B; Tables S4, S5). Thus, when they influenced 5′ss selection, ESEs tended to have effects opposite to those of ESSs.
ESSs Consistently Inhibit Usage of Intron-Proximal 3′ Splice Sites
To test whether 3′ss choice can be similarly controlled by known exonic splicing regulatory elements, a reporter containing competing 3′ss was constructed. Insertion of control decamers between the competing 3′ss gave ~80% usage of the intron-proximal 3′ss (Fig. S1C
). Using the same diverse panel of ESS decamers and hexamers as above, a strong and consistent effect on splice site selection was again observed, with 18 of 21 ESSs tested significantly inhibiting usage of the intron-proximal 3′ss relative to the distal site, often to a relative usage frequency below 25% (). These data support a generalization of the rule that ESSs inhibit intron-proximal splice sites to include 3′ss as well as 5′ss, with only rare exceptions (discussed in Supplemental Data).
When the same set of ten ESEs was tested in the competing 3′ss reporter, none had a detectable effect on 3′ss choice (Fig. S1D). Thus, generally speaking, ESSs had more pronounced effects on splice site selection than ESEs in this system.
Conservation and Activity of ESSs Located Between Alternative Splice Sites
The strong and consistent effects on splice site choice of ESSs inserted between competing splice sites suggested that ESSs might commonly play a role in regulation of natural alternative splice site exons. In accord with this idea, a significantly higher density of ESS hexamers was observed in the ‘extension’ regions between alternative 5′ss and between alternative 3′ss (yellow) than in the constant ‘core’ regions of such exons (blue), using large datasets of human alternative 5′ss exons (A5Es) and alternative 3′ss exons (A3Es) (). To further investigate this phenomenon, we analyzed the degree of conservation of ESS hexamers in the core and extension regions of orthologous human-mouse A5E and A3E exons (1074 and 1318 exon pairs, respectively). Consistent with common function, the number of aligned and conserved ESSs per 100 nt was far higher in extension regions of A5Es and A3Es than in the corresponding core regions (Fig. S2B
). Since the precise spacing of ESSs in exons may not be critical for their function (e.g., Han et al., 2005), we also analyzed conservation using a statistic called Conserved Occurrence Rate (COR), which does not require that ESSs in human and mouse extension regions be perfectly aligned in order to be considered conserved (see Methods). Using the COR statistic, we observed that the conservation of FAS-hex3 ESS hexamers in orthologous human/mouse A5E and A3E extension regions was substantially higher than for control sets of hexamers in both classes of exons (P = 4e-12 and P = 5e-4, respectively). These results, obtained by comparison with control sets of hexamers that have identical occurrence counts in A5E and A3E extension regions, indicate the presence of strong selective pressure to conserve occurrence of ESS sequences in these regions.
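The density comparison between core and extension regions can be sketched as a count of hexamer occurrences per 100 nt. The snippet below is only an illustration. The hexamer set and both sequences are hypothetical placeholders rather than the FAS-hex3 list or real exons, and the COR statistic is not implemented because its definition is given in the Methods, not here.

```python
def hexamer_density(seq, hexamers):
    """Occurrences of any hexamer in `hexamers` per 100 nt of `seq`.

    Overlapping occurrences are counted separately.
    """
    seq = seq.upper()
    if len(seq) < 6:
        return 0.0
    hits = sum(seq[i:i + 6] in hexamers for i in range(len(seq) - 5))
    return 100.0 * hits / len(seq)


# Hypothetical placeholder hexamers, NOT the actual FAS-hex3 set.
EXAMPLE_ESS = {"TTAGGT", "TAGGTA", "GGGTTT"}

# Made-up 20-nt sequences standing in for a core and an extension region.
core = "ATGGCCAAGCTGACCGATCC"
extension = "CCTTAGGTACGGGTTTAAGC"

print(hexamer_density(core, EXAMPLE_ESS))       # density in the core region
print(hexamer_density(extension, EXAMPLE_ESS))  # density in the extension region
```

Normalizing by region length makes short extension regions comparable with longer core regions. Whether overlapping hexamers should count once or separately is a design choice that the text does not settle.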
[Figure: Role of ESSs in control of alternative splice site choice]
To directly test the idea that ESS elements are commonly involved in regulation of splice site usage in human genes, splicing reporter minigenes were derived from natural A5E and A3E exons which had substantial EST coverage of both alternative isoforms (Supplemental Data). Exon 9 of the human AGER (advanced glycosylation end product-specific receptor) gene has two alternative 5′ss separated by sequences containing overlapping FAS-hex3 ESS hexamers. Compared to the wildtype exon, a two-base mutation that disrupted these ESSs resulted in a substantial increase in usage of the intron-proximal 5′ss. Exon 3 of the human H2RSP (HAI-2 related small protein) gene has two alternative 3′ss, again separated by multiple FAS-hex3 ESS hexamers, and also undergoes exon skipping. Mutation of the ESS hexamers located between the alternative 3′ss had two effects, reducing the level of exon skipping and also increasing the relative usage of the intron-proximal 3′ss. Therefore, in this instance, ESSs function in regulation of both exon skipping and alternative 3′ss choice. These and similar results obtained for the IL17RE gene (Fig. S3) support the idea that ESSs located between alternative splice sites commonly regulate splice site usage in human genes.
The degree of conservation of ESE hexamers in the same sets of orthologous human-mouse A5E and A3E exons was also explored. The non-alignment-based COR method showed significant ESE conservation in the extension regions for both categories of exons (Fig. S2D). In interpreting these results on ESE conservation, it should be kept in mind that for both classes of alternative splice site exons, some of the ESE conservation observed in the extension regions may reflect selection for exon inclusion rather than for regulation of alternative splice site usage. Furthermore, previous studies have indicated that ESE activity may often depend on position relative to the regulated splice site (e.g., Fairbrother et al., 2004; Graveley et al., 1998). Therefore, alignment-based metrics such as the number of conserved ESEs per 100 nt may be more appropriate for studying ESEs. Using this measure, we observed a higher density of conserved ESEs in the extension regions of A5Es than in the core regions, but no significant difference in ESE density between the core and extension regions of A3Es (Fig. S2C). These results mirror the experimental results described above, which showed that ESEs have more pronounced effects on competing 5′ss than on competing 3′ss (Fig. S1B, S1D), and suggest that ESEs may be commonly used to regulate splice site usage in endogenous alternative 5′ss exons.
Tethering of hnRNP A1 and SF2/ASF Mimics Effects of ESSs and ESEs
Both ESEs and ESSs are thought to commonly function through binding to specific trans-acting protein factors, whose level or activity may differ between cell types (Black, 2003
). To test whether the effects on splice site selection observed above could be explained by recruitment of canonical ESE- and ESS-associated factors, fusion proteins with the phage MS2 coat protein were used. The MS2 RNA hairpin, which is bound with high affinity by the MS2 coat protein, was inserted into the competing splice site reporter constructs in the location used to test candidate splicing regulatory elements, and cells were co-transfected with expression constructs for the MS2 coat protein fused to either the ESE-binding SR protein SF2/ASF, the ESS-binding hnRNP A1 protein, or the glycine-rich domain of A1 () (Del Gatto-Konczak et al., 1999
). Co-transfection of MS2-SF2/ASF with the competing 5′ss-MS2 hairpin reporter resulted in increased usage of the intron-proximal 5′ss (, lane 3) relative to mock-transfected controls. The MS2 fusions with either hnRNP A1 or its glycine-rich domain had the opposite effect, inhibiting use of the intron-proximal 5′ss (, lanes 5 and 7). That these shifts in splice site usage result from specific binding of the fusion proteins to the MS2 hairpin is supported by the observation that much smaller shifts were observed with the control MS2Δ reporter that contains a single nucleotide deletion in the MS2 hairpin which essentially abolishes binding to coat protein (lanes 4, 6, 8 of ). This mutation had no effect on splice site usage in the absence of fusion protein (lanes 1, 2 of ). Performing similar experiments using the competing 3′ss reporter, fusions with hnRNP A1 or its glycine-rich domain again strongly inhibited the intron-proximal splice site, but the SF2/ASF fusion protein had little or no effect on 3′ss usage in this system ().
[Figure: Effects of tethering splicing factors between competing splice sites]
The strong effects of the hnRNP A1 fusions on both 5′ss and 3′ss usage in the presence of the MS2 hairpin mirrored the effects of similarly positioned ESSs. The effects of the SF2/ASF fusion protein on 5′ss but not 3′ss usage also paralleled the differential effects on 5′ss and 3′ss usage observed for many ESEs (Fig. S1). Thus, the effects on splice site selection mediated by many ESS and ESE elements could potentially be explained by direct recruitment of splicing factors of the hnRNP and SR protein families, respectively, and the effects of A1 on splice site choice are likely mediated through its glycine-rich domain.
Sequence Specificity of hnRNP and SR Protein Effects
The sequence-specific effects of the MS2 hairpin relative to the MS2Δ sequence were observed using ratios of fusion protein expression plasmid to splicing reporter plasmid of 0.05:1 and 0.1:1 for the competing 5′ss and 3′ss reporters, respectively. These ratios represent the lower ends of the ranges that gave robust splice site inhibition, and are substantially lower than those typically used in splicing factor/splicing reporter co-transfection experiments (Bai et al., 1999
). When the ratio of MS2-A1 expression plasmid to 5′ss reporter plasmid was increased by up to 12.5-fold, a steady increase in non-specific inhibition of the intron-proximal splice site was observed, and at high levels of expression plasmid similar results were observed for the MS2 and MS2Δ reporters (). Since the MS2Δ mutation essentially abolishes binding of MS2 coat protein (affinity reduced by > 3,000-fold) (Schneider et al., 1992
), these data suggest that hnRNP A1 protein may have non-sequence-specific as well as sequence-specific effects on 5′ss usage. Additional support for this somewhat surprising conclusion comes from observing the effects of different concentrations of the A1 glycine-rich domain-MS2 coat fusion protein on splice site choice (Figure S4C). These data show that sufficiently high levels of this fusion protein, which completely lacks the canonical RNA-binding portion of the A1 protein, can inhibit the intron-proximal 5′ss in the presence of the mutant MS2 binding site. A similar loss in sequence specificity was observed when the ratio of MS2-SF2/ASF expression plasmid to competing 5′ss reporter was increased over a range of 12.5-fold (Fig. S4
), suggesting that this factor may also have non-sequence-specific as well as sequence-specific effects on 5′ss usage.
It is well established that SF2/ASF and other SR proteins can promote the use of intron-proximal splice sites in several systems, either when added to in vitro splicing reactions (Fu et al., 1992; Ge and Manley, 1990; Krainer et al., 1990) or when over-expressed in vivo (Bai et al., 1999; Caceres et al., 1994). Conversely, addition or over-expression of hnRNP A1 has been observed to antagonize the activity of SR proteins in several systems, often inhibiting splicing at intron-proximal sites (Bai et al., 1999; Caceres et al., 1994; Fu et al., 1992; Mayeda and Krainer, 1992). These observations led to a model in which the levels of SR proteins relative to hnRNP A1 could modulate the selection of alternative splice sites (Mayeda and Krainer, 1992). However, these studies generally did not address the issue of sequence-specific binding of the target transcript by the splicing factors whose concentrations were manipulated.
Our results indicate that over-expression of SR or hnRNP splicing factors using typical expression protocols may often give rise to non-sequence-specific effects on splice site choice (Fig. S4
), but that when splicing factors are expressed at more modest levels (as in , B), or are expressed at endogenous levels (as in and ), splice site selection is highly dependent on the sequences present between the competing splice sites. Thus, the non-specific effects observed with high levels of SR and hnRNP fusion protein expression constructs may simply be an artifact of over-expression. Taken together, the experimental and bioinformatic data described above support a model for splice site selection in which ESS sequences commonly mediate inhibition of intron-proximal decoy or alternative 5′ss or 3′ss, and ESEs often mediate an opposing effect on 5′ss selection. The set of ESSs and ESEs located between the splice sites could be used to tune the relative usage of a pair of alternative splice sites. Regulated splice site choice could then result from changes in the activities (expression levels, subcellular localization, modification state, etc.) of associated splicing regulatory proteins such as hnRNP and SR family proteins, each of which could potentially regulate a specific subset of A5E and A3E events depending on the precise ESS or ESE sequences present between the alternative splice sites. By contrast, if these factors simply promoted or inhibited use of all intron-proximal splice sites independent of sequence, such an activity would enable only a rather crude regulatory response to developmental or environmental cues. Thus, we propose that the region between alternative splice sites is critical for determining the level of usage of the alternative sites under different conditions or in different cell types.
Inhibition of Intron Retention by ESSs
Since ESSs can promote both exon skipping and alternative splice site usage, we next sought to determine whether ESSs can also regulate the remaining major class of alternative splicing events, intron retention. Though inserted into an intronic context in these experiments, we continue to refer to these elements as ESSs, keeping in mind that this designation implies nothing about their activity when located in an intronic context. For this purpose, we used a ‘multifunctional’ splicing reporter containing a pair of adjacent exons which are capable of splicing by multiple different pathways, leading to either: (i) inclusion of both exons with retention of the intervening intron (‘retained intron’ isoform); (ii) inclusion of both exons with splicing of the intervening intron (which we call the ‘fully spliced’ isoform because all pairs of splice sites are used); (iii) inclusion of the upstream exon with skipping of the downstream exon (‘single-skipped’ isoform); or (iv) exclusion of both exons (‘dual-skipped’ isoform). This combination of splicing events was found to occur in transcripts spanning exons 3 and 4 of the human NKIRAS2 (NFKB inhibitor interacting Ras-like 2) gene, from which a splicing reporter was derived (). The multiple pathways of splicing which this pair of exons can undergo afforded an opportunity to assess the relative effects of ESSs on exon skipping and intron retention.
[Figure: Role of ESS in regulation of intron retention]
The same diverse panel of ESS sequences and controls was inserted into the retained intron of the NKIRAS2 minigene, and splicing was assayed by transient transfection followed by quantitative RT-PCR to measure relative changes in spliced isoform levels. The retained intron form was the most abundant isoform produced both from the native construct and following insertion of control sequences (Fig. S5
). Notably, insertion of most ESSs substantially reduced the relative level of the retained intron form (, top band; the three exceptions were the same three sequences which failed to significantly affect 3′ss selection in ), suggesting that sequences capable of inhibiting exon inclusion from an exonic location may generally have the ability to inhibit intron retention from an intronic location. Consistent with this idea, it has recently been reported that intronic binding sites for certain hnRNP factors can stimulate the in vitro
splicing of pre-mRNAs containing artificially enlarged introns (Martinez-Contreras et al., 2006).
Although reduction in the level of the retained intron isoform was observed nearly universally across the set of ESSs studied, different ESSs affected the other isoforms in different ways. While almost all of the ESSs studied increased the levels of the single-skipped form (Fig. S5
), a scatter plot of the levels of the fully spliced and dual-skipped isoforms for the tested ESSs and controls indicated a curious mutually exclusive pattern (), suggesting the presence of two distinct classes of ESSs with distinct effects on splicing pathways. One group of ESSs, which we call Class 1, increased the dual-skipped form relative to the controls, with little or no effect on levels of the fully spliced isoform, while ESSs in the other group, which we call Class 2, increased the levels of the fully spliced form, with little or no change in the dual-skipped isoform (). Strikingly, no tested ESS significantly increased the levels of both of these isoforms or significantly reduced the level of either of these isoforms relative to the controls. The distinct effects of the two classes of ESSs in this reporter can potentially be explained by differing effects of the corresponding trans-
factors on splice site recognition and exon definition interactions (see Supplemental Data and Fig. S6).
Consistent with their distinct effects on splicing of the NKIRAS2
minigene, the sequences of Class 1 and Class 2 ESSs were generally quite distinct from each other (, right hand side). Class 1 sequences all contained either the tetramer TAGT or the pentanucleotide TAGGT, which resembles the binding motif for hnRNP A1 (Burd and Dreyfuss, 1994
). Class 2 sequences included most other ESSs, including a pyrimidine-rich group A sequence; an unclustered sequence; several G-rich and G/T-rich sequences containing GGTT and/or TGGG, many of which resemble the binding motif for hnRNPs F/H (Chen et al., 1999
); and sequences containing GTARGT (R = A or G), which is similar to the 5′ss consensus/GTRAGT. As described previously, this latter group of sequences, in addition to having ESS activity, can also be recognized as 5′ss (Wang et al., 2004
). The extra band that appears in above that for the fully-spliced isoform derives from use of these sequences in place of the normal 5′ss of the upstream exon.
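The sequence motifs named in this paragraph can be expressed as regular expressions for annotating candidate ESS sequences. This is a minimal sketch: the function name `describe_ess` is hypothetical, and the output is a set of motif flags, not a class assignment, since class membership was determined experimentally from isoform levels.

```python
import re

# Motifs taken from the text; the regex encodings are an illustration only.
CLASS1_MOTIF = re.compile(r"TAGG?T")      # TAGT or TAGGT
G_RICH_MOTIF = re.compile(r"GGTT|TGGG")   # G-rich and G/T-rich cores
FIVE_SS_LIKE = re.compile(r"GTA[AG]GT")   # GTARGT, R = A or G


def describe_ess(seq):
    """Report which of the motifs named in the text occur in `seq`."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA spelling
    return {
        "class1_motif": bool(CLASS1_MOTIF.search(seq)),
        "g_rich_motif": bool(G_RICH_MOTIF.search(seq)),
        "five_ss_like": bool(FIVE_SS_LIKE.search(seq)),
    }


# Toy inputs, not sequences from the tested ESS panel.
for example in ("ACTAGTCC", "CCGTAAGTCC", "GTAGGT"):
    print(example, describe_ess(example))
```

The patterns overlap: GTAGGT matches GTARGT and also contains TAGGT. Motif presence alone therefore cannot separate the two classes, which is why the sketch returns flags instead of a label.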
The pronounced effects of ESSs on intron retention in the NKIRAS2 reporter suggested that ESSs might commonly be involved in cases of endogenous regulated intron retention. It is exceedingly rare to observe skipping of both of the exons flanking a retained intron in endogenous human genes: in a database of over 1200 retained introns, we observed only one clear-cut case of dual-exon skipping based on available transcript data, which was the NKIRAS2 gene, used in the reporter above. Thus, in a typical intron retention situation where isoforms that retain or splice out the intron are desired, but dual exon skipping is not desired, regulation of intron retention by Class 2 ESSs might be preferred, since Class 1 ESSs would tend to induce undesired dual exon skipping. To explore this idea, we analyzed the frequencies of Class 1 and Class 2 ESS hexamers in a large dataset of alternatively retained introns, in comparison to constitutively spliced introns. Consistent with the above model for regulation of intron retention, we observed that retained introns have significantly lower frequencies of Class 1 ESSs and significantly higher frequencies of Class 2 ESSs than controls, at both the 5′ and 3′ ends of the intron (). Thus, although additional studies will be required to establish the roles of ESSs in specific retained introns, the data presented in and suggest that the Class 2 subset of ESSs may play a common role in inhibiting intron retention in human genes.
[Figure: Distribution of ESS hexamers in retained introns]
Perspectives on the Context and Scope of ESS Activity in Control of Splicing
In many natural contexts, ESSs function cooperatively with each other or with other splicing regulatory elements, sometimes exhibiting highly complex interactions (Black, 2003; Matlin et al., 2005). For example, silencing of the brain-region-specific CI cassette exon (exon 19) of the glutamate NMDA R1 receptor (GRIN1) transcript is mediated by a pair of exonic TAGG elements which function together with a GGGG motif that overlaps the 5′ss (Han et al., 2005). Here, we have shown that a large and diverse set of ESS elements has consistent effects on splicing in a variety of different sequence contexts, including reporters based on human SIRT1 exon 6 and its flanking introns, human AGER exon 9 and H2RSP exon 3 and their flanking introns, and exons 3 and 4 and associated introns of human NKIRAS2. Thus, no specific flanking sequence context appears to be generally required for activity of most of these ESSs, although of course the magnitude of their effect on splicing is presumably modifiable by surrounding sequences and dependent on the expression of specific trans-acting splicing factors.
The set of ESSs studied here was obtained using an in vivo screen for sequences with the ability to inhibit inclusion of a constitutive exon. Our initial bioinformatic analyses of constitutive and alternative/skipped exons suggested that these ESSs are avoided in constitutive exons but may commonly be involved in control of exon skipping (Wang et al., 2004), and regulation of exon skipping by ESSs with sequences related to several of the FAS-ESS groups has been described previously (Chen et al., 1999; Del Gatto-Konczak et al., 1999
). The bioinformatic and experimental analyses described here provide evidence that these ESSs are commonly involved in splice site definition of constitutive exons, and in the regulation of alternative 5′ss usage, and in the regulation of alternative 3′ss usage, and in the regulation of intron retention. These diverse functions of ESSs are summarized in . Thus, it appears that ESSs play important roles in control of all of the common types of alternative splicing that affect human genes.
[Figure: General roles of ESSs in alternative splicing]