We have termed the alternative isoform of CBP20 described in this report “CBP20S”, because it is shorter (103 amino acids) than the previously defined longer form (156 aa), which we term “CBP20L” for clarity. As shown in , CBP20S arises through alternative 5′ and 3′ splice site choice during the excision of intron 2. CBP20S mRNA is missing a large part of NCBP2 exon 2 and a few nucleotides of exon 3. There is no frame shift that would lead to either an altered amino acid sequence or one or more PTCs. Intriguingly, this alternative splicing event leads to the elimination of a large part of the RRM. The CBP20L RRM domain comprises residues 41–114 (based on SMART, smart.embl-heidelberg.de/) and residues 40–93 are excluded from the CBP20S sequence. This excluded part of the sequence includes the conserved RNP motifs 1 (residues 81–88) and 2 (residues 41–46).21
Furthermore, a TAT codon is formed at the new splice junction, introducing a tyrosine residue in the amino acid sequence (). These observations give rise to the following hypotheses. First, CBP20S could potentially have a regulatory function; for example, CBP20S could act as a dominant negative inhibitor of CBC function, if it interacts with CBP80 but fails to bind the m7G cap. Second, CBP20S might still retain RNA-binding capacity. Although CBP20S lacks a part of the RRM, one of the key tyrosine residues involved in cap binding is still present (Tyr 20) and a new tyrosine (Tyr 40) is introduced in a position similar to that of the deleted Tyr 43. Third, it is possible that CBP20S function is not related to cap metabolism but to another cellular process. Finally, the short isoform could represent splicing ‘noise’ and have no functional significance.22–25
We addressed these possibilities by investigating CBP20S expression, localization and protein- and RNA-binding properties.
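The residue numbers quoted above can be cross-checked with a short calculation. The following is a minimal sketch, assuming that the new splice junction contributes exactly one residue (the tyrosine encoded by the TAT codon) and that all quoted ranges are inclusive; the variable names are illustrative only.

```python
# Residue bookkeeping for CBP20S, using only the numbers quoted in the text.
CBP20L_LENGTH = 156           # residues in the long isoform
EXCLUDED = range(40, 94)      # residues 40-93 (inclusive) absent from CBP20S
NEW_JUNCTION_RESIDUES = 1     # assumption: one tyrosine from the TAT codon at the junction
RRM = range(41, 115)          # RRM domain, residues 41-114 (SMART annotation)
RNP1 = range(81, 89)          # RNP motif 1, residues 81-88
RNP2 = range(41, 47)          # RNP motif 2, residues 41-46

# Expected length of the short isoform.
cbp20s_length = CBP20L_LENGTH - len(EXCLUDED) + NEW_JUNCTION_RESIDUES

# Fraction of the RRM that falls inside the excluded segment.
rrm_removed = set(RRM) & set(EXCLUDED)
rrm_fraction_removed = len(rrm_removed) / len(RRM)

# Are both RNP motifs entirely within the excluded segment?
rnp1_lost = set(RNP1) <= set(EXCLUDED)
rnp2_lost = set(RNP2) <= set(EXCLUDED)

print(cbp20s_length, round(rrm_fraction_removed, 2), rnp1_lost, rnp2_lost)
```

Under these assumptions the calculation reproduces the stated length of 103 residues, and both RNP motifs lie entirely within the excluded segment.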
It was of interest to know whether CBP20S is expressed in human cells and how abundant it is compared to CBP20L. RT-PCR experiments showed that HeLa cells, A431 cells and primary cultures of human foreskin fibroblasts (HFF) expressed CBP20S (). From the dilution series of a PCR reaction, we estimate that CBP20S is approximately 20 times less abundant than CBP20L. Moreover, CBP20S was also detected in human bone marrow cells (). The shorter RNA species corresponded to CBP20S, as determined by sequencing the PCR products (data not shown). Furthermore, the UCSC Genome Bioinformatics database was searched for CBP20S expressed sequence tags (ESTs). Several ESTs corresponding to the short isoform from different human tissues were found and are summarized in . Strikingly, the vast majority of CBP20S-expressing cells or tissues were cancer-derived, suggesting that CBP20S expression may be either a feature or consequence of tumorigenesis. Note, however, that 50% of the ESTs in the EST database come from cancer cells.26
Interestingly, all CBP20S ESTs were identical with no variability around the alternative 5′ or 3′ splice sites, suggesting that they do not result from an unspecific lack of splicing fidelity within these two exons.
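Because about half of all database ESTs are cancer-derived, the enrichment of CBP20S ESTs in cancer samples is best judged against that 50% baseline. The sketch below shows one way to do so with a one-sided binomial test. The EST counts used here are hypothetical placeholders, not values from this study, and the function name is illustrative.

```python
from math import comb


def binomial_tail(successes: int, trials: int, p: float) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


BASELINE_CANCER_FRACTION = 0.5   # from the text: 50% of database ESTs are cancer-derived

# Hypothetical counts, for illustration only (NOT data from the paper).
hypothetical_total_ests = 12
hypothetical_cancer_ests = 11

p_value = binomial_tail(
    hypothetical_cancer_ests, hypothetical_total_ests, BASELINE_CANCER_FRACTION
)
print(f"one-sided p = {p_value:.4f}")
```

With real counts substituted for the placeholders, a small p-value would indicate that the cancer bias exceeds what the database composition alone would produce.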
Having established that CBP20S mRNA is present in various human cells, the question of CBP20S protein expression was addressed. CBP20L and CBP20S have predicted molecular weights of 18 and 12 kD, respectively. Western blot analysis of HeLa cell extracts showed the presence of a protein band with the same electrophoretic mobility as CBP20S, which was expressed from a plasmid as a marker (). The expression level of the putative CBP20S protein was lower than that of CBP20L, likely reflecting the mRNA ratio between the two isoforms. Thus, both CBP20S mRNA and protein can be detected in human cells.
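As a rough plausibility check on these molecular weights, protein mass can be approximated from residue count. This sketch assumes an average residue mass of about 110 Da, a common rule of thumb rather than the method used here; sequence-based predictions will differ somewhat.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # assumption: rule-of-thumb average mass per residue


def approx_mass_kda(n_residues: int) -> float:
    """Approximate protein mass in kDa from the number of residues."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


cbp20l_kda = approx_mass_kda(156)  # long isoform, 156 residues
cbp20s_kda = approx_mass_kda(103)  # short isoform, 103 residues

print(f"CBP20L ~ {cbp20l_kda:.1f} kDa, CBP20S ~ {cbp20s_kda:.1f} kDa")
```

The approximation lands within about 1 kD of the predicted values of 18 and 12 kD, which is as close as a composition-independent estimate can be expected to get.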
Next, we sought to determine whether other species express CBP20S. A BLAST search was performed, using both RNA and protein sequences. Six other mammalian species expressing both isoforms were identified (Suppl. Fig. 1A). The mRNA and protein sequences of CBP20S from all seven species were aligned. High levels of conservation were observed among the RNA sequences (Suppl. Fig. 1B). The proteins were identical in human, chimpanzee, rhesus macaque, horse and pig, while mouse and short-tailed opossum had a few mismatches (Suppl. Fig. 1C). The 5′ and 3′ splice sites used in the CBP20S alternative splicing event were also aligned. These splice sites were similar and, with the exception of single mismatches in the case of Pan troglodytes and Monodelphis domestica, followed the general mammalian splice site profile (Suppl. Fig. 1D). The extremely high degree of conservation of the CBP20S splicing event and protein product strongly suggests that this alternative form of CBP20 has an important function.
To gain insight into the function of CBP20S, both CBP20 isoforms were tagged with GFP and transiently expressed in HeLa cells. Microscopy analysis of the GFP-tagged isoforms revealed similarities and differences in their cellular localization (). Both proteins were detected in nuclei, excluding nucleoli. However, unlike CBP20L, CBP20S was only moderately enriched in the nuclei and did not display the speckled distribution characteristic of splicing factors. The nuclear localization pattern of both CBP20 isoforms clearly differed from that of GFP alone. The cytoplasmic distributions of CBP20L-GFP, CBP20S-GFP and GFP alone were indistinguishable, indicating no specific localization to any cytosolic compartment. These observations show that CBP20S protein is stable enough to accumulate at readily detectable levels and exhibits specific nucleoplasmic enrichment as well as a cytosolic pool.
Figure 2 Cellular localization of GFP-tagged CBP20 isoforms. HeLa cells were transiently transfected with CBP20L-GFP, CBP20S-GFP or GFP alone, fixed after 48 hours and imaged. Nuclei were stained with DAPI. Projections of z-stacks are shown. Scale bar, 10 µm.
CBP20L is known to interact directly with CBP80 and the m7G cap present at the 5′ end of RNA polymerase II transcripts. It was of interest to know whether either of these characteristics is true for CBP20S. First, we tested whether CBP20S-GFP could co-immunoprecipitate CBP80 (). As expected, CBP20L pulled down CBP80. In contrast, CBP20S did not. We conclude from this that CBP20S, even when overexpressed, cannot assemble with CBP80 to produce an alternative form of the CBC. Second, the capacity of CBP20S to bind the m7G cap was tested directly in a cap-binding assay, in which HeLa cell extracts were incubated with 7-methyl-GTP (m7GTP) coupled to sepharose and the bound proteins analyzed by western blotting. This assay shows that m7GTP-sepharose specifically binds the cap-binding complex subunits CBP20L and CBP80 in extracts of untransfected HeLa cells, but not an unrelated protein, glyceraldehyde-3-phosphate dehydrogenase (GAPDH). In transfected cells, m7GTP-sepharose selected CBP20L-GFP as well as endogenous CBP80, representing the canonical CBC (). In contrast, m7GTP-sepharose failed to bind CBP20S-GFP and GFP alone. We conclude that CBP20S, unlike CBP20L, has no detectable m7GTP-binding capacity in this assay. These results indicate that CBP20S is not a component of the CBC and does not bind m7G.
Figure 3 CBP20S does not interact with CBP80. HeLa cells were transiently transfected with CBP20L-GFP, CBP20S-GFP or GFP alone and harvested after 48 hours. Immunoprecipitation was carried out from cell extracts with α-GFP antibody. The immunoprecipitated and …
Figure 4 CBP20S does not bind m7GTP. (A) Extracts of untransfected HeLa cells were incubated with Sepharose 4B or m7GTP-Sepharose 4B. The selected and input samples were analyzed by western blotting with α-CBP80, α-CBP20 and α-GAPDH antibodies.
The above m7GTP-sepharose binding assay, which requires high-affinity interactions, does not exclude the possibility that CBP20S associates with capped RNA polymerase II transcripts in vivo. Therefore, the binding of CBP20 isoforms to pre-snRNA and mRNA species was analyzed. First, GFP-specific antibodies were able to immunoprecipitate CBP20L-GFP together with metabolically labelled U1 and U2 pre-snRNAs, as expected (). In parallel, immunoprecipitation of labelled snRNAs with a monoclonal antibody (K121) specific for hypermethylated (m2,2,7G) caps shows the gel positions of U1 and U2 pre-snRNAs (not yet trimmed at their 3′ ends) and mature snRNAs. Neither pre-snRNA was detected in complexes pulled down with CBP20S-GFP. Second, the ability of CBP20S to associate with mRNAs was tested by immunoprecipitation of CBP20S-GFP or CBP20L-GFP from cell lysates, followed by RT-qPCR detection of transcripts encoding c-myc (MYC) and β-actin (ACTB). In the case of CBP20L-GFP, both MYC and ACTB mRNAs were detected ~30-fold above background, indicating strong association (). Interestingly, CBP20S-GFP immunoprecipitated low but significant levels of both mRNAs, which were detected ~4-fold above background. Thus, although CBP20S does not detectably interact with pre-snRNAs, it appears to associate with mRNA to a significant extent.
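To give a sense of scale for the fold-above-background values, the sketch below converts between fold enrichment and the difference in RT-qPCR threshold cycles (Ct). It assumes perfect doubling per cycle and a background measured in a control immunoprecipitation; the exact quantification procedure is not stated in this section, so both the formula and the function names are illustrative assumptions.

```python
import math


def fold_over_background(ct_ip: float, ct_background: float,
                         efficiency: float = 2.0) -> float:
    """Fold enrichment of the IP over background.

    A lower Ct means more template, so the fold is
    efficiency ** (Ct_background - Ct_IP). efficiency=2.0 assumes
    perfect doubling in every PCR cycle.
    """
    return efficiency ** (ct_background - ct_ip)


def cycles_for_fold(fold: float, efficiency: float = 2.0) -> float:
    """Ct difference that corresponds to a given fold enrichment."""
    return math.log(fold) / math.log(efficiency)


# Ct differences implied by the enrichments reported in the text.
delta_ct_cbp20l = cycles_for_fold(30.0)  # about 4.9 cycles
delta_ct_cbp20s = cycles_for_fold(4.0)   # 2 cycles

print(round(delta_ct_cbp20l, 2), round(delta_ct_cbp20s, 2))
```

Under these assumptions, the ~30-fold signal for CBP20L-GFP corresponds to a shift of roughly five cycles, while the ~4-fold signal for CBP20S-GFP corresponds to about two cycles.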
Figure 5 CBP20S pulls down mRNA. HeLa cells were transiently transfected with CBP20L-GFP, CBP20S-GFP or GFP alone and harvested after 48 hours. (A) 3-hour incubation with 32P-orthophosphate preceded cell extract preparation and immunoprecipitation. Cell extracts …
To pursue the detected association of CBP20S with mRNA, we examined CBP20S-GFP recruitment to a model transcription site consisting of a stably integrated gene array (similar to a previously published array27). The transcription unit can be activated by doxycycline and located in the nuclear landscape, using a lactose repressor (LacI) fused to a fluorescent protein (). Interestingly, both CBP20L-GFP and CBP20S-GFP isoforms co-localized with RFP-LacI, indicating that both accumulated on the active transcription site (). No accumulation was observed under non-induced conditions (data not shown). To measure the dynamic properties of the different CBC components in the nucleoplasm, we photobleached the nucleoplasmic CBP signal and followed the recovery of fluorescence over time (fluorescence recovery after photobleaching, FRAP). These experiments revealed that CBP20L-GFP and CBP80-GFP show similar recovery kinetics in the nucleoplasm, whereas CBP20S-GFP displayed more rapid recovery. This suggests that CBP20L and CBP80 associate in a discrete, larger complex, while CBP20S diffuses within the nucleoplasm either alone or in a smaller complex. When we examined the association kinetics of these components with the active transcription unit, once again we identified pronounced and similar residency of CBP20L and CBP80 on the active transcription site, indicative of specific binding to the nascent transcripts. On the other hand, CBP20S showed very rapid association and dissociation rates, implying short-lived binding. This confirmed our previous evidence that CBP20S is not present in the CBC ( and ), while it does have the ability to transiently associate with mRNA (). We conclude that, although CBP20S-GFP displays more dynamic mobility than CBP20L-GFP in living cells, it nevertheless accumulates specifically on an active RNA Polymerase II transcription unit.
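The comparison of FRAP kinetics can be made concrete with a simple recovery model. The sketch below assumes single-exponential recovery, a common simplification that is not necessarily the analysis applied here, and uses hypothetical time constants chosen only to contrast a slower and a faster species.

```python
import math


def frap_recovery(t: float, mobile_fraction: float, tau: float) -> float:
    """Normalized fluorescence at time t after bleaching.

    Single-exponential model: F(t) = mobile_fraction * (1 - exp(-t / tau)).
    """
    return mobile_fraction * (1.0 - math.exp(-t / tau))


def half_time(tau: float) -> float:
    """Time at which half of the mobile fraction has recovered."""
    return tau * math.log(2.0)


# Hypothetical time constants in seconds (illustrative, NOT measured values).
TAU_SLOW = 10.0   # e.g. a protein held in a larger complex
TAU_FAST = 2.0    # e.g. a freely diffusing or smaller species

for label, tau in (("slow", TAU_SLOW), ("fast", TAU_FAST)):
    print(label, round(half_time(tau), 2), round(frap_recovery(5.0, 1.0, tau), 2))
```

In this model a shorter half-time corresponds to the more rapid recovery described for CBP20S-GFP, and a longer half-time to the slower recovery shared by CBP20L-GFP and CBP80-GFP.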
Figure 6 CBP20L and CBP20S accumulate in vivo on an active transcription site. (A) The recruitment of CBP20L-GFP and CBP20S-GFP was monitored on an active transcription unit driven by a tetracycline-inducible promoter. The gene locus is visualized via the co-integration …
Because CBP20S-GFP associated with mRNAs and localized to an active transcription site, we postulated that CBP20S might interact with one or more proteins involved in RNA processing. Therefore, we searched for CBP20S interactors by immunopurification followed by mass spectrometry (). CBP80 was not detected, consistent with our data that CBP20S is not a component of the CBC. Seven factors involved in RNA processing could be identified in the CBP20S interactome, among them hnRNP A2/B1 and PTB-associated splicing factor (PSF). Because both of these proteins are known to relocalize to the nucleolus or nucleolar caps upon transcriptional arrest,28,29 we tested the behavior of CBP20L and CBP20S after transcription inhibition by Actinomycin D (ActD). As shown in , CBP20L-GFP and CBP20S-GFP relocalize to nucleolar caps after ActD treatment, adding them to the array of nucleoplasmic proteins that display this behavior. Both CBP20 isoforms segregated to PSF-positive nucleolar caps (termed dark nucleolar caps, DNC) but not to fibrillarin-positive caps (termed light nucleolar caps, LNC).29
Taken together, the data suggest that CBP20S has a specific function in nuclear RNA processing.
Proteins associated with CBP20S
Figure 7 CBP20L and CBP20S redistribute to nucleolar caps after actinomycin D treatment. U2OS cells were transfected with GFP-tagged CBP20 isoforms, incubated with 5 µg/ml actinomycin D for 2 hours and fixed. Cells were stained with α-PSF and α-fibrillarin …