Partial genome tiling arrays previously showed that U1 protection is necessary to prevent drastic premature termination of the majority of nascent pol II transcripts by PCPA from cryptic PASs scattered throughout introns (Kaida et al., 2010). We refer to this activity as telescripting, as it is necessary for nascent transcripts to extend over large distances. Here, a rapid and versatile high-throughput strategy for identifying transcriptome changes, HIDE-seq, and experiments based on the wealth of information it provided, yielded a much greater definition of U1 telescripting, revealing that it is not only an evolutionarily conserved measure for ensuring transcriptome integrity, but also a robust mechanism for gene expression regulation. Surprisingly, PCPA position in a given gene varied depending on the amount of available U1. While U1 depletion inhibited splicing and caused PCPA, typically in the first intron, moderate U1 decreases (10–50%) did not inhibit splicing, but caused PCPA farther downstream, trending towards greater distances from the TSS with lesser U1 decrease. Significantly fewer, but still numerous, genes were affected by moderate U1 decrease compared to U1 depletion, and the effect was non-destructive, producing shorter mRNAs due to usage of alternative, more promoter-proximal PASs rather than the normal PAS at the 3′ end of the full-length gene. Thus, telescripting can be modulated by decreasing available U1 over a large range without splicing being compromised, consistent with the idea that there is an excess of U1 over what is required for splicing. We suggest that telescripting provides a novel global gene expression regulation mechanism.
Our data demonstrate that the predominant transcriptome changes resulting from incomplete telescripting are various forms of mRNA shortening, including 3′ UTR shortening and PCPA in introns. Thus, telescripting plays a major role in determining mRNA length and isoform expression. Indeed, several mRNAs of different lengths can be produced from the same gene by PCPA, corresponding to the degree of U1 decrease (e.g., GABPB1, UBAP2L). Importantly, PCPA can profoundly modulate protein expression levels and isoforms. While 3′ UTR shortening would not change the sequence of the encoded protein, it removes elements, such as microRNA- and hnRNP protein-binding sites, that could be critical for the mRNA’s regulation, including its stability, localization, and translational efficiency (Filipowicz et al., 2008; Huntzinger and Izaurralde, 2011). In contrast, PCPA in introns would produce an mRNA that encodes a protein lacking the C-terminus or containing a new C-terminal peptide, if the ORF extends into the terminal intron. An additional scenario associated with intronic PCPA is 3′ exon switching, also referred to as splicing-dependent APA, reflecting the general view that mechanistically, alternative splicing determines the PAS that is utilized. However, we showed that AMO masking of the alternative terminal exon’s PAS prevented its splicing, indicating that splicing into the alternative terminal exon depended on PCPA from this PAS. Thus, PCPA can be the primary event in 3′ exon switching, revealing an unexpected splicing-independent role for U1 in alternative splicing regulation.
Several lines of evidence strongly suggest that U1 telescripting is a physiological phenomenon. First, over the entire range of U1 AMOs we used, numerous PCPA sites coincided precisely with previously detected polyadenylated transcripts, indicating that these cryptic sites are utilized naturally. These include ESTs representing short mRNA isoforms from a wide range of specimens (Supplemental Figures S2 and S5), of which some have been noted as APA from proximal PASs (Lou et al., 1996; Pan et al., 2006; Tian et al., 2007). Second, many of the mRNA shortening events resulting from loss of PCPA suppression are indistinguishable from the widespread mRNA shortening observed in activated T lymphocytes and neurons, proliferating cells and cancer cells (Flavell et al., 2008; Mayr and Bartel, 2009; Sandberg et al., 2008; Zhang et al., 2005). Notably, ~33% of the genes that undergo 3′ UTR shortening in activated T cells (Sandberg et al., 2008) are similarly affected by low U1 AMO in HeLa cells (data not shown), as are RAB10 and CCND1 (Supplemental Figure S6), seen in cancer cells (Mayr and Bartel, 2009). Third, moderate U1 decrease recapitulated precisely hallmark switches to shorter isoforms in a neuronal activation model. Using native inducers (KCl and forskolin), we have shown that U1 decrease alone causes the same isoform switching in both homer-1 and Dab-1 in a dose-dependent manner. Representing the best-characterized example, the homer-1 gene switches to an isoform lacking the C-terminal domain-encoding exons, which antagonizes the full-length protein’s critical activity in synapse strengthening and long-term potentiation (Sala et al., 2003).
Having relied in our studies on deliberate U1 down-regulation, we considered whether there are physiological circumstances under which U1 levels could become deficient. Given U1’s abundance and very long half-life (Sauterer et al., 1988), it seemed unlikely that its levels would significantly decrease in absolute terms in the short timeframe in which mRNA shortening is observed. We considered an alternative scenario whereby U1 shortage relative to the targets it needs to protect in nascent transcripts could arise simply by an increase in transcriptional output of pre-mRNAs. Supporting this scenario, our measurements showed a rapid and transient increase in nascent transcripts of ~40–50% upon neuronal activation, while U1 levels showed little if any change. This creates a significant U1 shortage relative to nascent pre-mRNAs, the magnitude of which is in the same range as that of the AMO experiments. This transcription-driven gap, detectable at 2–4 hr and returning to baseline at 6–8 hr after activation, creates a window of opportunity for APA to occur from proximal PASs due to the transient decrease in telescripting capacity. Importantly, the switch to shorter isoforms during neuronal activation could be antagonized by U1 over-expression in a dose-dependent manner. These data provide further evidence that U1 PCPA suppression is a built-in PAS selection mechanism, and thus plays a major role in regulating gene expression during neuronal activation and potentially in response to other physiological stimuli.
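The size of the relative U1 shortage implied by these measurements can be illustrated with a short calculation. The sketch below is not part of the analysis reported here. It assumes, as a simplification, that the U1 available per nascent transcript scales as the ratio of U1 level to nascent transcript level; the function name and parameters are hypothetical.

```python
def relative_u1_decrease(transcript_increase, u1_change=0.0):
    """Fractional drop in U1 available per nascent transcript.

    transcript_increase: fractional rise in nascent transcripts (0.4 = +40%).
    u1_change: fractional change in U1 level (0.0 = unchanged).
    Assumption (not from the paper): availability scales as
    U1 level / nascent transcript level.
    """
    ratio = (1.0 + u1_change) / (1.0 + transcript_increase)
    return 1.0 - ratio


# Nascent transcripts rise ~40-50% while U1 stays roughly constant.
for rise in (0.40, 0.50):
    drop = relative_u1_decrease(rise)
    print(f"+{rise:.0%} nascent transcripts -> {drop:.0%} less U1 per transcript")
```

Under this assumption, a 40–50% rise in nascent transcripts with unchanged U1 corresponds to roughly a 29–33% relative U1 decrease, which falls within the 10–50% range of moderate U1 decrease.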
The mechanism and factor(s) involved in the mRNA shortening in diverse activation conditions have not been identified (Flavell et al., 2008; Ji and Tian, 2009; Mayr and Bartel, 2009; Sandberg et al., 2008). However, the remarkable similarities they have with U1 shortage suggest that they also involve loss of U1 telescripting, at least during the initial (immediate/early) phase. Other factors have been described that could also cause a shift to proximal alternative PASs, particularly up-regulation of general polyadenylation and 3′ end processing factors, such as Cstf64 (Chuvpilo et al., 1999; Takagaki et al., 1996). However, this requires new protein synthesis and takes many (>18) hours (Shell et al., 2005), and therefore cannot explain the rapid switch to proximal PAS, in contrast to U1 shortage, which is immediate upon stimulation and occurs even in the presence of protein synthesis inhibition (data not shown) (Loebrich and Nedivi, 2009). The potential role of U1 and other factors at later times after cell activation and in other cells remains to be determined. Other means of creating U1 shortage, without transcription increase, can be envisioned, such as its sequestration in nuclear structures or by expression of other RNAs to which it could bind.
We propose a model for U1 telescripting that could explain its role in mRNA length regulation and isoform switching. We suggest that PCPA occurs co-transcriptionally by the same machinery that carries out normal 3′ end cleavage and polyadenylation in the terminal exon of the full-length gene, and is a byproduct of the coupling between transcription and this downstream process (Calvo and Manley, 2003; Dantonel et al., 1997; McCracken et al., 1997). CPA factors associate with the pol II TEC (Glover-Cutter et al., 2008; Hirose and Manley, 1998) close to the TSS and are therefore poised to process newly transcribed PASs with favorable sequence and structural features (actionable PASs) that they encounter throughout most of the length of the gene. This is normally prevented by U1 snRNP that binds to the nascent transcript. Previous studies have shown that U1 can inhibit polyadenylation of the normal PAS when tethered in its proximity in the terminal exon and in vitro (Ashe et al., 2000; Fortes et al., 2003; Gunderson et al., 1998; Vagner et al., 2000). U1 is recruited to nascent pol II transcripts, including intronless transcripts, by multiple interactions with the pre-mRNA and RNA processing factors as well as the transcriptional machinery (Brody et al., 2011; Das et al., 2007; Lewis et al., 1996; Lutz et al., 1996). In all three organisms we studied, PCPA was typically not detected in the first few hundred nucleotides, rising sharply thereafter and peaking ~1 kb from the TSS. It is possible that actionable PASs upstream of this point are not utilized because CPA factors may not yet have associated with the TEC (Mayer et al., 2010; Mueller et al., 2004), which depends on pol II’s carboxyl terminal domain phosphorylation state (Buratowski, 2009). We suggest that this lag could serve (and may have evolved) to allow U1 binding before transcripts are exposed to CPA, to prevent their early destruction. Consistent with this distance-from-TSS consideration, PASs in the 5′ UTR that are not PCPAed are functional when placed at the 3′ end of the gene (Guo et al., 2011).
Our data also indicated that 5′ss-bound U1 has a limited protective range of up to ~500–1000 nt and therefore would be insufficient to ensure telescripting through larger introns. U1 bound at the 5′ss alone could not provide complete PCPA suppression even within this perimeter and plays almost no role in protecting more distal PASs, which depend instead on additional U1 bound in introns. Indeed, introns contain numerous U1 binding sites that do not function as 5′ss, which we suggest serve to anchor U1 to protect introns from PCPA. Furthermore, we found that U1 with a mutated 5′ sequence that cannot function in splicing can still function in telescripting, supporting the notion of two separate U1 roles. U1’s capacity to interact with nascent transcripts directly, even without base pairing (Patel et al., 2007; Spiluttini et al., 2010), enhances its association throughout the pre-mRNA, allowing it to scan the transcript and accelerate the rate at which it can find more stable base pairing sites, including the 5′ss. Importantly, PAS mutations cause PCPA from downstream intronic PASs, suggesting a directional process, consistent with a co-transcriptional mechanism. We propose that U1 shortage causes transcript shortening because as co-transcriptional recruitment of U1 to nascent transcripts becomes limiting, it leaves distal PASs less protected, providing a built-in and U1 dose-dependent mechanism for mRNA length regulation and isoform switching.