The diversity of the human proteome is generated in large part by alternative splicing of pre-mRNAs transcribed from a limited number of genes (1). Alternative splicing often results in the expression of different protein isoforms with diverse and even antagonistic activities. Of particular interest are alternative splicing events that are modulated according to developmental and cell-specific regulatory programs. Cis-acting elements that mediate cell-specific regulation have been identified within several pre-mRNAs (4), and some have been shown to confer cell-specific regulation on heterologous exons (5). To understand the mechanisms of cell-specific alternative splicing, it is necessary to identify and characterize the proteins that bind these elements.
The CUG-BP and ETR-3 like Factor (CELF) proteins [also known as Bruno-like (Brunol) proteins] are an emerging family of proteins that regulate alternative splicing (7). The human genome contains six known CELF paralogs. While one protein, CUG-BP, is expressed widely, CELF3 and CELF5 are expressed only in brain, and CELF4, ETR-3 and CELF6 are expressed in a subset of tissues (7). Individual CELF proteins have been shown to regulate splicing of human and chicken cardiac troponin T (cTNT) exon 5, human insulin receptor (IR) exon 11, human muscle-specific chloride channel (ClC-1) intron 2, rat NMDA receptor exons 5 and 21, and the mutually exclusive NM and SM exons of rat alpha-actinin (7). The binding sites associated with regulated splicing have been identified for CUG-BP and ETR-3 and are typically U/G-rich motifs located within the introns adjacent to the regulated exons (12). A neuron-specific CELF ortholog in Caenorhabditis elegans, UNC-75, which is nearly 47% identical to human CELF4 and has a role in modulating neurotransmission, is proposed to regulate neuron-specific alternative splicing (18). Human CELF4 can rescue an unc-75 null mutant phenotype and, like UNC-75, the rescuing human CELF4 protein localizes in nuclear speckles with splicing factors.
Chicken and human cTNT alternative exon 5 are well-characterized targets of CELF regulation. cTNT undergoes developmentally regulated alternative splicing that is conserved in avian and mammalian species, such that the exon is predominantly included in embryonic heart and skeletal muscle and predominantly skipped in the adult (19). For chicken cTNT, enhanced exon inclusion in embryonic striated muscle requires four muscle-specific splicing enhancers (MSEs) that are 40–45 nt in length and are located within the introns immediately surrounding exon 5 (5). One MSE, MSE2, is sufficient for robust regulation of a heterologous exon in embryonic striated muscle when present in multiple copies located upstream and downstream of the alternative exon (5). CUG-BP, ETR-3 and CELF4 have been shown to bind directly to U/G-rich motifs within MSEs 2 and 3 of chicken cTNT, and CUG-BP binds to a U/G-rich motif 19 nt downstream of human cTNT exon 5 (12 and data not shown). All six CELF proteins activate MSE-dependent exon inclusion when co-expressed with human or chicken cTNT minigenes in non-muscle cells (7, T. Ho and T. Cooper, unpublished data). Point mutations within the U/G-rich motif of human cTNT that prevent binding of CUG-BP also prevent regulation by all six CELF proteins transiently expressed in vivo, as well as activation of exon inclusion in skeletal muscle cultures (15, T. Ho and T. Cooper, unpublished data). Furthermore, inclusion of chicken cTNT exon 5 is induced by addition of recombinant ETR-3 to in vitro splicing assays using HeLa nuclear extracts. Point mutations within the U/G motifs in MSEs 2 and 3 that prevent ETR-3 binding also prevent activation by ETR-3 (12). These results demonstrate that CELF proteins bind to U/G motifs within cTNT MSEs and directly activate exon 5 inclusion.
The domain structure of CELF proteins is similar to that of the Elav protein family: two closely spaced RNA recognition motifs (RRM1 and RRM2) near the N-terminus, a 160–230 residue ‘divergent domain’ and a third RRM (RRM3) near the C-terminus (see Figs 1A and 4A). The CELF RRMs show a high degree of sequence identity among family members; however, there is little sequence identity in the divergent domain. All six genes express multiple isoforms due to alternative splicing, which generates variability within the N-termini, divergent domain and RRM3 of the protein, and within the mRNA 5′ untranslated region. The six CELF proteins can be separated into two groups based on sequence identity and functional differences. One group contains CUG-BP and ETR-3, which are 78% identical. The second group contains CELF3, CELF4, CELF5 and CELF6, which exhibit at most 43.8% identity to CUG-BP and 62–66% identity to each other. The two protein groups also differ in their ability to regulate splicing of an exon flanked upstream and downstream by three concatamerized copies (six copies total) of MSE2. CELFs 3–6 activate inclusion of this exon, while CUG-BP and ETR-3 do not (7). Both groups bind MSE2, suggesting that the groups differ either in their ability to bind concatamerized binding sites or in ‘post-binding’ events, such as making the appropriate protein–protein interactions required for splicing activation.
Figure 1. Splicing activity of human CELF4 deletion mutants. (A) Diagram of full-length human CELF4 protein showing RRM (dark gray) and conserved RNP2 and RNP1 motifs (black). Numbers above the diagram indicate N- and C-terminal positions of the RRMs.
Figure 4. Splicing activity of human ETR-3 deletion mutants. (A, B) Deletion endpoints are comparable to those in CELF4 (Fig. 1A) based on alignment of ETR-3 and CELF4 proteins. The ETR-3 residue numbers are according to accession number AAK92699.
We have defined the protein domains required for splicing activation in vivo by CELF4 and ETR-3, which were chosen as representatives of the two CELF subgroups. These analyses identified separate domains required for binding and for splicing activation in both CELF4 and ETR-3. For CELF4, we found that RRM1 and RRM2, or RRM2 alone, plus the adjacent 66 amino acids of the divergent domain activate splicing to a level equivalent to that of the full-length protein. Interestingly, non-overlapping N- and C-terminal segments of ETR-3 each activate MSE-dependent exon inclusion, indicating that ETR-3 has functionally redundant segments at the N- and C-termini of the protein. Moreover, an inactive CELF4 deletion mutant that lacks functional N-terminal RRMs and does not bind RNA inhibits the ability of active CELF4 proteins to activate exon inclusion in vivo. The dominant-negative activity of this protein is likely to result from its disruption of protein–protein interactions that are required for splicing activation.