Our experiments revealed an unexpected function for U1 snRNP in protecting transcripts from PCPA, in addition to and independent of its role in splicing. As a reference for U1 AMO, the general splicing inhibitor SSA25, which inactivates the U2 snRNP component SF3b5,26, allowed identification of introns that were stable enough to accumulate to detectable levels when their splicing was inhibited. As expected, the patterns observed for U1 AMO and SSA (the latter similar to that of U2 AMO) showed that both efficiently inhibited splicing. However, functional reduction of U1 snRNP had an additional and striking effect: failure to produce full-length pre-mRNA from the majority of genes in our dataset. We showed that this was due to premature cleavage and polyadenylation from a cryptic PAS, typically in an intron and frequently within the first few kilobases (< 5 kb) of the start of RNA polymerase II transcripts. A non-splicing role for an snRNP has previously been shown for U2 snRNP in the 3′ end formation of histone mRNAs27,28.
The mechanism by which U1 snRNP suppresses PCPA is not presently known. However, because PCPA occurs from canonical PASs, it is reminiscent of previous observations on the capacity of tethered U1 snRNP to regulate normal 3′ end cleavage and polyadenylation from the natural PASs in the last exon29, and may have features in common with it. For example, the U1 snRNP protein U1-70K can interact directly with the poly(A) polymerase (PAP)30,31 and inhibit polyadenylation. Targeting 5′-mutated U1 snRNAs with complementarity to sequences in the vicinity (within < 500 nt) of the natural PAS in the 3′-terminal exon results in degradation of the transcript, because cleavage occurs without addition of a poly(A) tail, leaving the transcript vulnerable to 3′ exonucleases32. A considerable number of genes that we surveyed but did not include in our analysis showed a decrease in exon signals, or in both intron and exon signals, throughout the transcript in U1 AMO-treated cells. It is possible that in these cases cleavage occurred without subsequent polyadenylation, and the transcript was therefore rapidly degraded. Alternatively, cleavage and polyadenylation may have occurred very close to the transcription start site, making these transcripts difficult to detect. These scenarios are nevertheless consistent with a role for U1 snRNP in suppressing cleavage and polyadenylation throughout the entire pre-mRNA, by machinery similar to that which until now was thought to process only the 3′ end of mRNA, and in an even larger number of genes than our dataset presents.
Stochastically, canonical PASs (most frequently AAUAAA or AUUAAA) occur about once every 2,000 nucleotides, although in several of the genes we studied, including NR3C1, STK17A and BASP1, cryptic PASs are found every 500-800 nt. The strong 5′ bias with which PCPA occurred in these genes upon U1 snRNP functional knockdown suggests that one of the first few cryptic PASs is utilized. Up to the point at which PCPA occurred, these transcripts also contained many cryptic 5′ splice sites (Supplementary Figure 2). We propose the following model to explain our observations. Pre-mRNA processing factors, including splicing factors, hnRNP proteins, snRNPs and 3′ end cleavage and polyadenylation factors, co-transcriptionally associate with nascent transcripts33-37. Direct association of cleavage/polyadenylation factors with the CTD of RNA pol II in the transcription elongation complex has been demonstrated36. U1 snRNP associates with nascent transcripts by base pairing with cognate sequences on the nascent pre-mRNA, including 5′ splice sites and cryptic 5′ splice sites, and prevents the cleavage/polyadenylation machinery from attacking the pre-mRNA at cryptic PASs. We envision that when U1 snRNP’s base pairing is prevented, as is the case in U1 AMO-transfected cells, cleavage and polyadenylation occur co-transcriptionally at the first actionable PAS that the transcription elongation complex encounters. By actionable PAS, we mean one that has the necessary hexanucleotide consensus and is in an RNP context that makes it accessible and susceptible to attack by the cleavage/polyadenylation machinery unless U1 snRNP base paired in the vicinity is able to protect it. We suggest that under normal circumstances, this encounter happens after the last strong U1 binding site (5′ splice site or cryptic 5′ splice site), in the 3′ UTR of the terminal exon, because a sufficient density of U1 snRNP base paired throughout the transcript protects it up to that point. The likelihood of normal or premature termination may be enhanced by the presence of pausing sites38.
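The stochastic estimate and the notion of a "first actionable PAS" can be illustrated with a toy calculation. Assuming equal and independent base frequencies, a given hexamer is expected once per 4^6 = 4,096 positions, so two alternative hexamers give about one site per 2,048 nt, in line with the figure of roughly 2,000 nt. The sketch below also scans a sequence for the first canonical hexamer that has no U1 base-pairing site nearby. The function names, the symmetric protection window and its size are hypothetical and chosen only for illustration; the actual perimeter of U1 snRNP protection is not known, and RNP context is ignored here.

```python
import re

def expected_pas_spacing(n_hexamers=2, k=6):
    """Expected nt between hexamer matches, assuming uniform random bases."""
    return 4 ** k / n_hexamers

def find_pas_sites(rna):
    """Return 0-based start positions of AAUAAA or AUUAAA in an RNA string."""
    # A lookahead is used so that overlapping hexamers are all reported.
    return [m.start() for m in re.finditer(r"(?=A[AU]UAAA)", rna)]

def first_actionable_pas(rna, u1_sites, window):
    """Return the first PAS with no U1 site within `window` nt, else None.

    ASSUMPTION: protection is modeled as a simple symmetric distance
    cutoff around each U1 site; this rule is hypothetical, not measured.
    """
    for pos in find_pas_sites(rna):
        if not any(abs(pos - site) <= window for site in u1_sites):
            return pos
    return None

if __name__ == "__main__":
    toy = "GG" + "AAUAAA" + "CCCC" + "AUUAAA" + "GG"
    print(expected_pas_spacing())
    print(find_pas_sites(toy))
    # With U1 "bound" at position 0, the first hexamer is treated as protected.
    print(first_actionable_pas(toy, u1_sites=[0], window=5))
    # With no U1 sites (a crude stand-in for U1 AMO), the first hexamer is used.
    print(first_actionable_pas(toy, u1_sites=[], window=5))
```

In this toy model, removing all U1 sites shifts the predicted cleavage position to the first hexamer encountered, which mirrors the 5′ bias described above without claiming anything about the real protection distance.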
U1 snRNP bound to 5′ splice sites may thus serve a dual purpose: in splicing and in suppression of PCPA. The perimeter of U1 snRNP’s protective zone is not known, but its binding to the 5′ splice site alone is unlikely to be able to protect the majority of introns, which in humans average ~ 3.4 kb in length39. Furthermore, if suppression of actionable PASs were provided only by U1 snRNP bound to 5′ splice sites, 5′ splice site mutations would be expected to cause premature termination rather than, for example, exon skipping; such premature termination would be extremely deleterious and, to our knowledge, has not been observed. Additional U1 snRNP binding sites, including cryptic 5′ splice sites, may function as tethering sites for its activity in suppression of cleavage and polyadenylation in introns. Viewed from this perspective, sequences referred to as cryptic 5′ splice sites may serve a non-splicing purpose, recruiting U1 snRNP to protect introns. It is also reasonable to consider that modulating U1 snRNP levels, or its binding at sites that protect actionable PASs, could be a mechanism for regulating gene expression, including down-regulation of the mRNA or switching expression to a different mRNA produced from a prematurely terminated pre-mRNA. We suggest that the vulnerability of an intron to PCPA would be expected to increase with increasing intron size if U1 snRNP and cognate base-pairing sites are not available to protect it. We propose that the large excess of U1 snRNP over what is required for splicing in human cells serves an additional critical biological function: to suppress PCPA in introns and protect the integrity of the transcriptome.
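The expectation that vulnerability grows with intron size can be made concrete with a rough back-of-the-envelope calculation. The sketch below estimates the chance that an unprotected random sequence of a given length contains at least one canonical hexamer (AAUAAA or AUUAAA). It assumes uniform, independent base composition and treats every start position as an independent trial, neither of which holds exactly for real introns; the function name and the example lengths are illustrative only. Under these assumptions, an intron of average human length (~ 3.4 kb) would contain at least one such hexamer with a probability of roughly 0.8.

```python
def prob_at_least_one_pas(length_nt, n_hexamers=2, k=6):
    """Chance of >= 1 canonical hexamer in a random sequence of length_nt.

    ASSUMPTIONS: uniform independent bases, and each start position is an
    independent trial (overlap correlations are ignored). An approximation
    for illustration, not a measured quantity.
    """
    p_site = n_hexamers / 4 ** k              # per-position match probability
    n_positions = max(length_nt - k + 1, 0)   # possible hexamer start positions
    return 1.0 - (1.0 - p_site) ** n_positions

if __name__ == "__main__":
    # Example lengths are arbitrary except 3,400 nt (~ average human intron).
    for length in (500, 3400, 20000):
        print(length, round(prob_at_least_one_pas(length), 2))
```

The hexamer alone does not make a PAS actionable in the sense used above, so these values are upper bounds on vulnerability within this simplified model rather than predictions of PCPA frequency.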