The CELF family of RNA-binding proteins regulates many steps of mRNA metabolism. Although their best characterized function is in pre-mRNA splice site choice, CELF family members are also powerful modulators of mRNA decay. In this review we focus on the different modes of regulation that CELF proteins employ to mediate mRNA decay by binding to GU-rich elements. After starting with an overview of the importance of CELF proteins during development and disease pathogenesis, we then review the mRNA networks and cellular pathways these proteins regulate and the mechanisms by which they influence mRNA decay. Finally, we discuss how CELF protein activity is modulated during development and in response to cellular signals. We conclude by highlighting the priorities for new experiments in this field.
CELF (CUGBP, ELAV-Like Family) proteins have gone by a plethora of names (BRUNO-like proteins, CUGBP proteins, CUGBP and ETR-3 like factors, EDEN-BP, NAPOR, etc.) because they were independently discovered and cloned by distinct research groups working with organisms ranging from humans to fruit flies. The inconsistent nomenclature has created much confusion; thus, for the sake of simplicity, we will primarily use the CELF acronym approved by the Human Genome Organization to describe this family of proteins. This acronym maintains the historical links to the prototypical and best studied member of the family, CELF1 (previously known as CUGBP1), and also highlights that the CELF proteins are related to ELAV (Embryonic Lethal Abnormal Vision) proteins (CELF2 was formerly known as ETR3, which stands for ELAV-Type RNA binding protein-3). ELAV factors form an evolutionarily conserved family of RNA-binding proteins that interact with U-rich sequences [1, 2].
The story of human CELF proteins began when CELF1/CUGBP1 was isolated as a factor that bound (CUG)8 RNA in vitro [3, 4]. As CUG-repeat expansion is the root cause of myotonic dystrophy type 1 (DM1), this discovery spawned an avalanche of investigations into the role of CELF proteins in DM1 and in muscle. Many of these focused on CELF proteins as regulators of pre-mRNA splice site choice, and collectively such studies demonstrated that altered CELF function is a major contributor to pathology in DM1.
Just before human CELF1 was identified, the Drosophila CELF protein, Bruno, was determined to play an essential role as a translational regulator in early development  and a Xenopus homolog (EDEN-BP) was shown to regulate deadenylation . Since these early discoveries, research in each model system has tended to focus on one aspect of CELF protein function: splicing in mammals, translation in Drosophila and mRNA deadenylation in Xenopus. These processes are often coupled, however, and there is now substantial evidence that CELF proteins are multifunctional and regulate splicing, translation and mRNA decay in mammals. Moreover, these functions of CELF1 in modulating mRNA decay and translation are undoubtedly as important in development and disease as its role in pre-mRNA splicing.
In this review, after describing the biological effects of CELF-mediated regulation, we specifically examine the impact of CELF proteins on mRNA decay, a process that profoundly influences gene expression in all cells. For most mRNAs, decay initiates with removal of the poly(A) tail (deadenylation) and this rate-limiting step is modulated by CELF proteins through interaction with deadenylase enzymes. Below, we discuss how CELF proteins recognize their target transcripts in order to initiate coordinated and rapid changes in gene expression that occur during development and in response to external stimuli. Despite an overabundance of studies implicating CELF proteins as splicing regulators, emerging studies related to their role in controlling mRNA decay indicate that this function is likely just as essential.
CELF proteins are found in almost all eukaryotes, with many organisms encoding multiple homologs. Six members of the CELF family have been identified in humans and mice. CELF1 (CUGBP1/NAB50/BRUNOL2) and CELF2 (CUGBP2/ETR3/BRUNOL3/NAPOR) are expressed widely in many different tissues, whereas CELF3, CELF4 and CELF5 are restricted to adult tissues and found almost exclusively in the nervous system [9, 10]. CELF6 is expressed in kidney, brain and testes.
CELF proteins bear RNA Recognition Motifs (RRMs), responsible for binding to RNA, and are characterized by the arrangement of these domains as well as their conservation (Figure 1). Each CELF protein has two N-terminal RRMs followed by a linker and a third C-terminal RRM [12–15]. This structural arrangement is shared with the ELAV proteins which also play important roles in mRNA metabolism . However, beyond the arrangement of RRMs, the similarity between ELAV and CELF proteins is limited to the highly conserved residues that define the RRM motif (Figure 1). In contrast, CELF1 and CELF2 are virtually identical within their RNA-binding domains (>90% conserved) but the linker region is more divergent. At the sequence level, CELF3–6 are more closely related to each other than to CELF1 and CELF2, but homology with CELF1 is still readily detected. In addition to the three RNA-binding domains, other notable functional domains have been characterized including a C-terminal lysine/arginine-rich nuclear localization signal conserved in all CELF proteins, and a region with nuclear export activity found in the linker domain of CELF2  (Figure 1). The roles and regulation of these, and other features of these proteins, are discussed below.
In this section we will examine how CELF proteins mediate their wide-ranging effects by binding to diverse mRNA targets to modulate gene expression at a post-transcriptional level. We will focus specifically on mRNA decay, but it is important to note that this is just one of many processes that CELF proteins influence. Indeed, depending on where CELF proteins associate with their targets (5′UTR, introns, 3′UTR), and what other factors they interface with, they can impact polyadenylation, splicing, editing, localization, and translation as well as mRNA turnover. Although CELF protein functions outside of mRNA decay regulation must be glossed over here in the interests of brevity, they have been reviewed elsewhere [17–20]. In addition, studies investigating other aspects of mRNA metabolism that have provided insights relevant to CELF protein roles in mRNA decay regulation will be discussed.
An astounding number of studies have examined the binding preferences of CELF proteins, and particularly CELF1, using in vitro techniques such as three-hybrid, Systematic Evolution of Ligands by Exponential enrichment (SELEX) [22, 23], Nuclear Magnetic Resonance (NMR), Surface Plasmon Resonance (SPR), Electrophoretic Mobility Shift Assay (EMSA) [23, 25], as well as RNA-Immunoprecipitation followed by microarray [26–29] or Cross-Linking Immunoprecipitation followed by sequencing [30, 31] (RIP-Chip/CLIP-Seq) in cultured cells. The findings are generally in agreement and clearly demonstrate that CELF1 and its homologs (EDEN-BP in Xenopus [29, 32], Bru and Bru-3 in Drosophila, and Brul in Danio rerio), as well as mammalian CELF2 and CELF4, all recognize GU-rich RNA sequences with high affinity. The binding sites can take two forms: either a UG-repeat or a U-rich sequence interspersed with G nucleotides. The prominent similarity in binding preference amongst family members is perhaps not surprising given their high level of conservation. However, it is curious to note that CELF1 was first identified through an association with CUG repeat-containing RNA and it has been reported to favor GC-rich sequences [36, 37]. CELF2 was also recently found to associate with the androgen receptor (AR) mRNA through the expanded CUG repeats that cause Spinal Bulbar Muscular Atrophy. These observations encourage further investigation, as the affinity of recombinant CELF1 protein for CUG repeats in vitro is much lower (~100 fold) than for UG-rich sequences, and there is no evidence for significant enrichment of CUG sequences in RIP-seq analyses, which are designed to entrap natural targets of RBPs [30, 31]. At this point, re-visiting the binding preferences following post-translational modification or in the presence of other RBPs would likely be very informative.
In addition to binding GREs, CELF1 and CELF2 have also been shown to associate with AU-rich elements (AREs) in TNF  and COX2/PTGS2 mRNAs, respectively . These interactions appear to be biologically significant as disrupting them alters stability of the target transcripts, but there is little evidence to suggest that AREs are a common recognition site for these proteins.
The sequence elements recognized by CELF proteins can differ somewhat depending on their context. CELF proteins regulate splicing in the nucleus, primarily through interaction with intronic sequences adjacent to alternatively spliced exons [22, 31, 41]. Intronic CELF binding sites closely resemble those found in 3′UTRs; both are generally GU-rich [30, 31]. In the cytoplasm, CELF proteins bind the 3′UTR to regulate mRNA decay  and translation [7, 40, 42, 43]. They have also been reported to associate with 5′UTR sequences to influence translation [44, 45]. To date only a few 5′UTR binding sites have been characterized. Most 5′UTR CELF1 binding sites appear to be more GC-rich [44, 46, 47], apart from that of the p27KIP1/CDKN1B mRNA, which is GU-rich and resembles the canonical CELF1 motif . It seems that the CELF protein must be post-translationally modified or cooperate with other RBPs to function at this end of the transcript.
As mentioned above, CELF proteins have three RRMs and each can interact with RNA. Unfortunately, it has not yet been possible to generate structural information for an intact CELF protein. Despite this, we can glean some important insights from studies of RRM1/2 and RRM3 fragments.
NMR solution studies demonstrated that RRM1, RRM2 [14, 15] and RRM3  each recognize (U)UGU(U) motifs. The tandem RRM1/2 domains together show increased affinity compared to the binding by each domain separately thus indicating binding cooperativity between the two RRMs [14, 15]. The divergent linker domain between RRM2 and RRM3 also appears to be important for RNA-binding since it was able to increase RNA-binding affinity, perhaps by conveying important conformational changes necessary for RNA-binding [21, 24, 48]. In support of this, recent studies of Drosophila Bruno RRM3 and CELF1 RRM3 show that both require an N-terminal extension in order to achieve high affinity interaction with RNA [13, 49]. Close examination of this region reveals a highly conserved seven amino acid stretch (Q(K/R)EGP(E/D)G) adjacent to RRM3 in all CELF proteins (Figure 1). This short motif is unique to CELF proteins and is not found in any other RNA-binding protein. Interestingly, RRM3 in conjunction with the N-terminal extension adopts an unusual conformation to interact with the UGU sequence . This may facilitate regulation of RNA-binding affinity through post-translational modification and/or protein-protein interactions. Overall, it seems likely that all three RRMs contribute to recognition of substrate RNAs, but their relative impact and/or binding preferences may change in response to cellular signals.
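The conserved seven amino acid stretch described above, Q(K/R)EGP(E/D)G, maps directly onto a regular expression. The following is only a minimal sketch of how one might scan a protein sequence for it; the function name `find_celf_motif` and the example peptides are hypothetical and are not real CELF sequences.

```python
import re

# Motif from the text, Q(K/R)EGP(E/D)G, written as a regular expression.
# [KR] means lysine or arginine; [ED] means glutamate or aspartate.
CELF_MOTIF = re.compile(r"Q[KR]EGP[ED]G")


def find_celf_motif(protein_seq):
    """Return 0-based start positions of the motif in a one-letter
    amino acid sequence (case-insensitive)."""
    return [m.start() for m in CELF_MOTIF.finditer(protein_seq.upper())]


if __name__ == "__main__":
    # Made-up peptides for illustration only.
    for peptide in ("MAAAQKEGPEGSSS", "QREGPDG", "QAEGPEG"):
        print(peptide, find_celf_motif(peptide))
```

A scan like this only reports sequence matches; it says nothing about whether the stretch sits adjacent to RRM3 or contributes to RNA binding in a given protein.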
While the binding preferences of CELF proteins were being uncovered, it became clear that the GU-rich elements (GREs) they recognize are essential regulators of mRNA stability in mammalian cells [18, 50]. Genome-wide analyses of mRNA decay rates allowed isolation of mRNAs that exhibited rapid turnover, and computational methods were then used to search for conserved sequence elements in their 3′UTRs. In activated human T lymphocytes, transcripts that exhibited rapid mRNA decay were found to frequently contain one or more consensus “UGUUUGUUUGU” elements or GU-repeats. These elements were recognized by CELF1 and conferred instability onto reporter mRNAs [25, 27]. Both GU-rich sequences and GU-repeats are also enriched in unstable myoblast mRNAs. It has since become clear that both types of GRE (GU-repeats and UGUUUGU sequences) recruit CELF1 to induce mRNA decay; thus the GRE should perhaps be loosely defined as “UGUKUGU”, where K denotes either G or U.
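To make the loose “UGUKUGU” definition concrete, the sketch below scans a sequence for that heptamer, reading K as the IUPAC code for G or U so that both the UGUUUGU and UGUGUGU forms match. It is an illustration under stated assumptions rather than the method used in the cited studies: the function name `find_gres` is hypothetical, and a sequence match alone does not establish CELF1 binding or instability.

```python
import re

# "UGUKUGU" with K = G or U (IUPAC). The lookahead lets overlapping
# occurrences be reported, which matters for repeats such as UGUUUGUUUGU.
GRE_PATTERN = re.compile(r"(?=(UGU[GU]UGU))")


def find_gres(seq):
    """Return (0-based start, matched heptamer) for every UGUKUGU
    occurrence. DNA input is accepted by converting T to U."""
    rna = seq.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in GRE_PATTERN.finditer(rna)]


if __name__ == "__main__":
    print(find_gres("UGUUUGUUUGU"))  # the 11-mer consensus quoted in the text
    print(find_gres("UGUGUGUGU"))    # a short GU-repeat
```

Applied to the 11-nucleotide consensus, the scan reports two overlapping heptamers, which is one way to see why the shorter definition covers both GRE types.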
Based on their overall U-richness, their location in the 3′UTR and their ability to modulate rates of mRNA decay, GREs bear significant similarity to the better known AU-rich element (ARE) found in the 3′UTR of many mRNAs encoding cytokines, transcription factors and growth factors. Indeed, with hindsight it seems likely that GREs may be contained within the Class III “non-AUUUA” AREs defined by Peng et al. in 1996, which were originally found in the JUN and FOS 3′UTRs. Nonetheless, the GRE has several distinct functional characteristics that are summarized in Figure 2.
First, GREs may be less potent destabilizing elements than AREs, at least in some cell types. In Huh-7 cells, GUUUG-type GREs had a milder destabilizing effect on a reporter RNA than AREs . In addition, at least in T-cells, GRE-containing mRNAs generally have longer half-lives than ARE-containing transcripts . However, in C2C12 myoblasts, GREs were more closely associated with instability than AREs, suggesting that potency may vary in different cell types . This could reflect tissue specific differences in expression or activity of GRE-binding factors. Second, while AREs have additive effects on mRNA decay, the number of overlapping GUUUG pentamers in the GRE does not seem to correlate with mRNA decay rate. Third, the repertoire of proteins reported to interact with GREs (CELF1, CELF2, ELAVL1, TDP43, ELAVL4) is smaller than for AREs (HuR, AUF1, TIA1, TIAR, TTP, KSRP, CELF2, CELF1 and others) but there is considerable overlap that warrants further investigation. Finally, there is a clear propensity for mRNAs encoding cytokines, apoptosis factors and members of the NF-κB cascade to contain AREs rather than GREs. Nevertheless, several functional classes of mRNAs can contain either of these two types of instability element (e.g. cell proliferation, differentiation, transcription, signal transduction). Surprisingly, very few transcripts have both types of regulatory element. Perhaps the mRNA metabolism pathways that GREs and AREs influence are mutually exclusive; the two types of transcript might be localized differently or even be expressed in different cell types.
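The comparison above refers to the number of overlapping GUUUG pentamers in a GRE. As a hedged illustration of what that count means, the sketch below tallies overlapping occurrences of a motif in a sequence. The helper name `count_overlapping` is hypothetical and is not taken from the cited work.

```python
def count_overlapping(seq, motif="GUUUG"):
    """Count occurrences of motif in seq, allowing overlaps.
    DNA input is accepted by converting T to U."""
    rna = seq.upper().replace("T", "U")
    last_start = len(rna) - len(motif)
    # Test every possible start position so that pentamers sharing a
    # nucleotide (e.g. GUUUGUUUG) are each counted.
    return sum(1 for i in range(last_start + 1) if rna.startswith(motif, i))


if __name__ == "__main__":
    print(count_overlapping("UGUUUGUUUGU"))
```

With a count like this in hand for each 3′UTR, one could compare it against measured half-lives; the text reports that, unlike for AREs, no clear correlation emerges.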
At this point it is clear that CELF1, and likely other CELF proteins, associate with GREs in the 3′UTRs of a large number of mRNAs, but what impact does this have on mRNA stability? For CELF1, simply tethering the protein to the 3′UTR of a reporter was sufficient to destabilize the mRNA . In addition, reporters bearing artificial GREs are destabilized following over-expression of CELF1 in COS-6 monkey kidney cells . This is very consistent with observations made in vitro that show CELF1 association with RNA correlates with enhanced deadenylation in both HeLa  and Xenopus [8, 55] cell-free systems. Further investigation uncovered a direct interaction between CELF1 and the PARN deadenylase  which nicely explains these findings and leads to a very simple model where binding of CELF1 allows recruitment of the PARN deadenylase and results in accelerated deadenylation (Figure 3A). This phenomenon has been characterized in mammalian cell extracts , and also occurs in Xenopus extracts. However, the impact, if any, of PARN/CELF1 collaboration is not well-characterized in living cells. Several groups have identified mRNAs associated with CELF1 and shown that at least some of these are subject to CELF1-dependent decay, but the role of PARN in this process is not really known. PARN does impact decay rates for some mRNAs , but so far too few PARN substrates have been identified to allow analysis of whether they are more likely to be bound by CELF1. Identification of PARN substrates has been hindered by the fact that PARN is just one of a number of deadenylase enzymes present in mammalian cells; other deadenylases may compensate when PARN function is depleted. In addition, CELF1 might recruit multiple deadenylases, or other as yet undefined components of the mRNA decay machinery, to enhance mRNA decay rates.
Despite strong evidence that both GREs and CELF1 act to enhance mRNA decay, it would be a huge over-simplification to state that CELF protein association always results in destabilization of substrate RNAs. On the contrary, CELF2 has clear stabilizing influences on several of its substrates, including COX2/PTGS2 and the AR mRNAs [38, 57]. This is quite surprising given that CELF1 and CELF2 are virtually identical at the amino acid level and thus would be expected to have comparable rather than opposing effects on mRNA decay. A clue as to how this might be explained comes from a recent study that showed CELF1 is required for stabilization of p21/CDKN1A mRNA during bortezomib-mediated apoptosis . Under these conditions CELF1 relocalizes to stress granules (SGs) and pulls the p21 mRNA along with it, where it is protected from mRNA decay enzymes and its translation is repressed. The conditions under which CELF2-induced mRNA stabilization has been documented (e.g. γ-irradiation [59–61], curcumin treatment ) induce apoptosis and therefore should also be considered stressful. It seems possible that the function of CELF2 in mRNA decay has fortuitously been studied predominantly under stress conditions where CELF2 acts to stabilize mRNAs by relocating them to SGs. To our knowledge this possibility has not been directly investigated, but it is interesting to note that curcumin also induces expression of the TIA1 RNA-binding protein which is essential for SG formation and for relocalization of mRNAs to stress granules .
Aside from CELF1 and CELF2, the other four mammalian CELF proteins have not been extensively studied with respect to their role in mRNA degradation. However, CELF4 associates with the 3′UTRs of up to 20% of mRNAs expressed in the murine hippocampus and cortex. Abundance, localization and translation of these transcripts are impacted by CELF4 knockout indicating that CELF4 at least has the potential to influence mRNA stability .
There is no clear homolog of PARN in the Drosophila genome and there is little evidence so far that the Drosophila CELF protein, Bruno, modulates mRNA decay. Nevertheless, as mentioned previously, Bruno is essential for normal oogenesis and early embryogenesis, where it mediates translational repression of localized mRNAs such as Oskar [48, 63, 64]. Multiple mechanisms are involved, but one in particular has been linked with the eIF4E-binding protein Cup. Cup interacts with Bruno bound to the Osk 3′UTR and presumably prevents eIF4E from recruiting eIF4G to initiate translation. There is an interesting twist to this story, however: Cup has the ability to recruit the CCR4 deadenylase and decapping activators to an mRNA substrate. As Bruno can bind Cup, this finding strongly implies that the Drosophila Bruno protein may indirectly engage deadenylases (Figure 3B). Further investigation will be required to ascertain whether this is indeed the case, but it is intriguing that the ability of CELF proteins to induce mRNA deadenylation may be conserved from flies to humans. As PARN has the ability to interact with, and be stimulated by, the 5′ cap structure [68, 69], the fact that Cup has connections with both the cap (through eIF4E) and deadenylase activity (through CCR4) is also noteworthy.
CELF1 binds to a wide variety of mRNAs, but of what value is this to the cell? It is hard to estimate the global importance of CELF binding, but one likely purpose is to allow coordinated control of mRNAs encoding factors required for an appropriate cellular response. This coordination is evident in a number of studies and occurs at multiple levels, from splicing to translation to mRNA decay. First, during T-cell activation a large number of GRE-containing mRNAs exhibit increased stability and abundance, many of them encoding factors important for cell proliferation, apoptosis and activation. Second, in Xenopus oocytes, CELF1 works with PARN to coordinate the deadenylation of GRE-containing mRNAs immediately following fertilization. The substrate mRNAs are predominantly maternally deposited transcripts within the oocyte that must be translationally silenced after fertilization and eventually degraded. The process is initiated by calcium-dependent dephosphorylation of CELF1, but the effect of this event on CELF1 interactions with RNA or other proteins remains to be deciphered. Third, there is also evidence to suggest that CELF1 coordinates important changes in gene expression during muscle development, although the targets are not well defined. In C2C12 cells, CELF1 is associated with a number of mRNAs that encode factors important for myogenesis and as such is likely to influence the dramatic changes that occur in the abundance of some of these transcripts as cells differentiate. In addition, down-regulation of CELF1 has been linked with the global switch from embryonic to adult splicing patterns that occurs in neonatal murine hearts. This down-regulation would also be expected to influence stability of CELF-associated transcripts. Given the close ties between CELF1 and myotonic dystrophy, it will be imperative to fully characterize changes in stability of CELF-associated transcripts that occur during myogenesis and in DM patient cells.
The wide-ranging impact of CELF1 on networks of mRNAs is a property that is almost certainly shared with other CELF proteins. CELF4, for example, binds to over 2000 mRNAs in mouse brain tissue, many of which encode factors involved in the regulation of synaptic function. As such, it may be intimately involved in coordinating changes in protein synthesis and/or mRNA decay in response to synaptic stimuli .
CELF proteins do not serve mere housekeeping functions; they are essential to many of the changes in gene expression that occur during development and in response to extracellular stimuli. As such, CELF proteins exhibit regulation at multiple levels including abundance, localization, and RNA-binding affinity.
During embryonic and early post-natal development gene expression must be extensively reprogrammed as cells differentiate into specific tissues. CELF proteins undoubtedly play a role in this reprogramming by influencing splice site choice and presumably mRNA decay rates and translation as well. Changes in the abundance of CELF proteins that favor certain splice events have been well documented. For example, expression of CELF1 and CELF2 proteins decreases dramatically in the murine heart after birth and this correlates with a global change from embryonic to adult splicing patterns . Abundance of both CELF1 and CELF2 also changes during myogenesis in C2C12 cells with comparable effects on splice site choice .
In heart, the developmental changes in CELF protein abundance have been attributed to miRNA-mediated regulation as there is little change in the amount of the CELF mRNAs. The two miR-23 family miRNAs (miR-23a, miR-23b) have a reciprocal expression pattern to CELF1 and CELF2 proteins; miRNA abundance increases as the heart becomes more mature. Moreover, inhibition of miR-23a function resulted in elevated expression of CELF proteins in vivo . CELF2 is also targeted by miR-144/miR-451 in a pathway that influences cardiomyocyte survival .
CELF protein abundance is modulated during certain cellular responses. Polyamine depletion increases abundance of CELF1 protein in intestinal epithelial cells by down-regulating miR-503, a miRNA that targets the ORF of CELF1 to repress translation . CELF2 expression is induced in colon cancer cells in response to ionizing radiation but in this case regulation is likely to be at the level of transcription or mRNA decay as the abundance of CELF2 mRNA is increased [60, 61]. Curiously, this response can be prevented by pre-treatment with lipopolysaccharide which completely blocks induction of CELF2 expression . Finally, CELF1 expression in pre-adipocytes increases with age. This increase also occurs at the mRNA level and may be linked with the increased abundance of TNF in aging fat tissue as treatment of pre-adipocytes with either TNF or LPS induces CELF1 expression . A great deal more work is needed to uncover the mechanisms that regulate CELF expression at the transcriptional and post-transcriptional levels.
Post-translational modifications are known to modulate the activity of a wide range of RNA-binding proteins and CELF proteins are no exception. CELF1 and CELF2 have quite extensive Ser/Thr-rich segments within their linker domain that could represent phosphorylation sites (Figure 4). Phosphorylation of CELF1 has been detected at Ser28 (Akt kinase) and Ser302 (Cyclin D/cdk4/6) and in each case these modifications increase the affinity of the protein for certain mRNA substrates  and affect the ability of CELF1 to interact with translation factors . CELF1 also becomes phosphorylated during T-cell activation. In this case the site of phosphorylation is not known, but it results in lower affinity of CELF1 for GRE-containing target transcripts . As a result, many of these transcripts are more stable following activation. CELF2 RNA-binding can also be modulated through phosphorylation; when rat smooth muscle cells are treated with PDGF, C-SRC tyrosine kinase is activated and phosphorylates CELF2 on Tyr49 . This modification enhances association of CELF2 with COX2/PTGS2 mRNA and results in mRNA stabilization.
Hyperphosphorylation of CELF1 has been reported as a major contributor to pathogenesis in myotonic dystrophy [79, 80]. Although the specific site(s) are again not known, phosphorylation requires Protein Kinase C (PKC) and increases stability of CELF1 protein leading to its over-expression . PKC inhibitors can reverse the symptoms exhibited by mouse models of DM1 presumably by limiting CELF1 phosphorylation . It is not known whether the RNA-binding preferences of CELF1 are affected by hyperphosphorylation, but splicing patterns are consistent with increased CELF1 activity in DM1 patient cells and mouse models of the disease . This is expected given the increased accumulation of the CELF1 protein in the nucleus. The effects on cytoplasmic functions, such as mRNA decay, await further characterization.
In Xenopus oocytes, in order for CELF1/EDEN-BP to activate deadenylation of its maternal mRNAs following fertilization it must undergo calcium-dependent dephosphorylation . It is not known whether activation involves enhanced RNA-binding or improved CELF1 interaction with the deadenylase. Again, characterization of the precise sites of phosphorylation is needed to shed light on this phenomenon.
Finally, acetylation of Lys436 has been reported for CELF1 (as well as the analogous residue in CELF2) [84, 85]. The role of this particular modification has been explored only superficially for cytoplasmic proteins. However, as this lysine lies within RRM3 it seems very possible that it could impact RNA-binding affinity or protein-protein interactions. Interestingly, similar modifications were recently reported for the cytoplasmic poly(A)-binding protein where they potentially influence many aspects of mRNA metabolism including deadenylation and translation .
One of the major priorities must be to identify sites of modification and characterize their impact on CELF protein function. Many functions, from RNA-binding preferences and protein-protein interactions, to localization and decay activity, may be extensively impacted. Dramatic changes in substrate preference following modification might explain why unmodified recombinant CELF1 prefers GU-rich sequences while some studies report affinity for CUG- and GC-rich sequences in cell extracts. Such information could lead to exciting new avenues for therapeutics for DM1 patients.
In order for an RNA-binding protein to influence decay of its substrate mRNAs, the two must be able to come together and also to interact with effector factors such as the mRNA decay enzymes. Thus one of the main ways that activity can be regulated is through restricting the RBP to certain subcellular compartments. Like many RBPs, in most cells CELF1 is found in the nucleus where it modulates splicing and in the cytoplasm where it influences translation and mRNA decay. However, the distribution of CELF1 can vary in disease or in response to stress and this almost certainly impacts its function.
Various environmental stresses, such as heat shock and viral infection, converge on the translation initiation factor eIF2α. Phosphorylation of eIF2α inhibits translation and induces the aggregation of mRNAs and RNA-binding proteins into cytoplasmic domains known as stress granules (SGs). There, mRNAs are stored and eventually either released for translation or targeted for mRNA decay. The TIA1 and TIAR RNA-binding proteins are essential components of SGs and their C-terminal glutamine/asparagine-rich domains are vital for SG formation. During stress, cytoplasmic CELF1 relocalizes to SGs, likely through a direct interaction with TIA1. As an example, the relocalization of CELF1 during heat shock is shown in Figure 5. Importantly, although CELF1 is not required for SG formation, it is required for relocalization of the p21/CDKN1A mRNA to SGs, and also for stabilization of p21 mRNA during stress. It seems likely that other CELF1 mRNA targets experience this type of regulation too. Moreover, as SGs can be induced under cellular conditions that mimic DM1, and they have been observed in DM1 patient myoblasts, relocalization of CELF1 to SGs may result in changes in mRNA translation and decay that could play a role in DM pathogenesis. In support of this, CELF1-mediated translation is impacted in DM1 myoblasts. Other CELF proteins may also be relocalized during stress, but this has not been investigated to date.
CELF proteins have also been found in other types of RNA granules. In particular, CELF4 is present in large neuronal RNA granules of as yet unknown function , and the Drosophila Bruno protein participates in aggregating mRNAs into silencing particles that are required for spatially restricted expression of its target transcripts . It seems likely that sequestration of mRNAs into such aggregates would restrict access of mRNA decay enzymes and alter mRNA decay rates.
While CELF1 redistribution can be a natural and normal function of this protein, there are occasions where CELF1 mislocalization may be pathogenic. In cancer cells, CELF1 can be found in a specialized region of the cell termed the perinucleolar compartment (PNC)  where it is associated with the RNA component of RNAse MRP (mitochondrial RNA-processing) and PTBP (polypyrimidine tract binding protein) . If CELF1 is sequestered within the PNC, its ability to perform its normal cellular functions, including cytoplasmic mRNA decay, would obviously be impaired. The function of the PNC remains obscure, but as its presence correlates closely with metastasis and poor prognosis, the localization of CELF1 to this region will be a high priority for further investigation. Sequestration of CELF1 may also contribute to pathogenesis in oculopharyngeal muscular dystrophy (OPMD), as it accumulates in the nuclear inclusions that form in this disease .
Finally, although CELF1 and 2 usually are distributed in both the nucleus and cytoplasm, under certain conditions cytoplasmic abundance may be elevated. This is the case for CELF2 following γ-irradiation, where the majority of the protein becomes cytoplasmic . This change correlates with increased expression of CELF2 splice isoforms that localize to the cytoplasm . Increased cytoplasmic accumulation is a common response for many RBPs (reviewed in [97, 98] and almost certainly influences the decay of mRNAs targeted by these proteins.
Although one might imagine that six different genes encoding CELF proteins would be sufficient, there is evidence for multiple isoforms of each of these six proteins. Most of the CELF genes encode multiple variant transcripts through use of different promoters and/or alternative mRNA processing (according to the UCSC genome browser, http://genome.ucsc.edu/). Different transcripts can encode different protein isoforms or allow for tissue-specific or developmental regulation of expression. CELF2, for example, has three promoters that generate three different isoforms, each of which has a unique N-terminal sequence and 5′UTR. The encoded proteins are expressed differently, with the shortest isoform being preferentially induced in response to ionizing radiation and more highly expressed in colon cancer cells [96, 99]. Notably, the N-terminally extended isoforms are more cytoplasmic and thus may impact decay of mRNA targets. It remains to be seen whether the altered localization might be achieved through a nuclear export signal encoded only in the extended isoforms.
NCBI RefSeq Gene records show that CELF2 and CELF5 also have alternative 3′UTRs and C termini generated by alternative splicing, which could lead to differences in regulation and protein function as well. Finally, both CELF2 and CELF4 have an alternatively spliced exon that encodes the N-terminal part of the third RRM (exon 14 for CELF2, exon 9 for CELF4). For CELF2, skipping of this exon occurs to different extents in different tissues and the loss of the N-terminal part of RRM3 prevents this domain from binding RNA. Moreover, the CELF2 isoform lacking exon 14 has altered function with regard to its ability to modulate splice site choice. This may result either from altered RNA-binding affinity or from effects on protein-protein interactions. Further studies are needed to assess the global influence of different CELF protein isoforms in various tissues and at different developmental stages.
The human and Xenopus CELF1 proteins and Drosophila Bru-3 bind to RNA as dimers [32, 33] and may require GU-rich sequences of sufficient length to allow dimer formation. Surrounding U-rich sequences may be necessary for assembly of CELF proteins on RNA by allowing optimal secondary structure to facilitate the formation of RNA-protein complexes. Interestingly, at least for Xenopus CELF1 (EDEN-BP), the ability to interact with other CELF1 molecules is essential for RNA-binding and for induction of deadenylation. The N-terminal region of the linker domain is required for this interaction .
In addition to being required for RNA-binding, CELF-CELF interactions can facilitate mRNP formation in other ways. Oligomerization of the Drosophila CELF protein, Bruno, can lead to formation of large silencing particles that prevent access of ribosomes . These large RNPs perhaps facilitate translational silencing and transport of mRNAs. It is not clear whether this type of function is conserved among CELF proteins in higher eukaryotes.
The fact that CELF1 can dimerize with itself gives rise to the possibility that heterodimers between different CELF proteins also exist. Such heterodimers might have different binding specificities or functions. In support of this idea, a dominant negative CELF4 (DNCELF4Δ) protein lacking all of RRM1 and part of RRM2 is able to interfere with the splicing activity of wild-type CELF2 and CELF4 in cultured cells . Similar effects on splicing are seen when this mutant is expressed in either heart or skeletal muscle. DNCELF4Δ expression in the heart causes dilated cardiomyopathy and cardiac hypertrophy  while expression in skeletal muscle has milder effects on muscle organization and fiber size . As the DNCELF4Δ mutant does not bind RNA , the effect is most likely mediated through direct interactions between the mutant CELF4 protein and other CELF proteins. One implication of this is that sequestration or inactivation of one CELF protein could impact the function of other CELF proteins by influencing the availability of heterodimers. To date, this is a virtually unexplored area of CELF protein function, especially with respect to effects on mRNA decay.
Each mRNP is a complex and ephemeral entity that can be rapidly remodeled in response to changing conditions. As such, it is important to consider the influence of other RNA-binding proteins (and RNAs) on CELF function. Such factors can hinder or help RNA binding, either through steric mechanisms or by altering the sub-cellular location of the transcript to make it more or less accessible to CELF.
Muscleblind (MBNL) proteins have an extremely intimate and wide-ranging relationship with CELF proteins. They are zinc finger-type RBPs that, like CELF proteins, have been strongly implicated as one of the main cellular proteins impacted in myotonic dystrophy [106, 107]. Although CELF and MBNL proteins have distinct binding preferences (MBNL1 prefers YGCY), they associate with many of the same mRNAs to regulate splicing. Evidence to date indicates that these factors oppose each other, with CELF binding favoring one splice pattern and MBNL encouraging the opposite. Recent studies have shown that MBNL1 is also associated with a large number of mRNA 3′UTRs in C2C12 cells and many substrates are shared with CELF1. MBNL1 binding to the 3′UTR has multiple effects on mRNA metabolism, as both mRNA stability and subcellular localization can be altered [31, 109]. Interestingly, both MBNL1 and CELF1 appear to favor destabilization of their substrates, but more work is required to determine whether they have additive effects when both associate with the same substrate mRNAs. It is also not known whether MBNL1 affects deadenylation rates.
One of the most effective ways to block binding of CELF proteins is through competition by another RBP that recognizes the same sequence elements. Many RBPs have similar binding preferences including several of those that recognize AREs (e.g. TIA1, AUF1/HNRNPD, HuR/ELAVL1, HuD/ELAVL4) (Figure 2). Of note, HuR and RBM38 (RNPC1) both favor U-rich sequences that are deficient in cytosine residues, which have obvious overlap with the UGUUUGU type CELF binding sites. Moreover, CELF1, HuR and RBM38 are all associated with CDKN1A/p21 mRNA [26, 110] and HuR and RBM38 are essential for myogenesis [111, 112], a process in which CELF1 has been clearly implicated . In fact, there is significant overlap between the sets of transcripts bound by HuR and CELF1 . In addition, CELF2 has been shown to compete with HuR for an AU-rich binding site on the COX2 mRNA leading to translational silencing .
The TDP43/TARDBP RNA binding protein is also worth a mention as it binds avidly to long UG repeats in introns to regulate splicing, and also to UG-rich regions in 3′UTRs to enhance mRNA decay, much as CELF proteins do . As an example, both CELF2 and TDP43 can bind to a UG-rich region in intron 8 of the CFTR1 transcript to promote exon skipping . More analysis is required to determine whether CELF proteins and TDP43 actually share RNA substrates, but if they do, CELF protein function may be altered in diseases where TDP43 is mutated, such as in Amyotrophic Lateral Sclerosis.
It is likely that CELF proteins also influence miRNA function as has been seen for HuR , RBM38  and several other RBPs. This could be achieved through direct competition for a shared binding site, or through remodeling of the mRNA structure to favor (or impede) miRNA association nearby. In support of this, a recent analysis determined that UUUGUUU motifs, which bear an uncanny resemblance to CELF binding sites, are enriched adjacent to miRNA binding sites and their presence tends to augment miRNA activity . Such cooperation between miRNAs and CELF proteins could potentiate the ability of these factors to induce rapid mRNA decay and translational repression.
Any miRNA that contains a UGUKUGU seed sequence could in theory bind mRNA and occlude its association with CELF proteins. Although this has not been investigated, miR-495 is worth mentioning as it has perfect seed alignment to the GUUUGUU motif. This miRNA is differentially expressed in cancer cells [120, 121] and also in several inherited muscular disorders, including DM. We note that several validated targets of miR-495, including REDD1 [26, 120], are also associated with CELF1 by co-immunoprecipitation.
In addition, CELF proteins could potentially indirectly influence mRNA decay rates by controlling miRNA processing. Other RNA-binding proteins, such as TDP43 , and KSRP , have been shown to associate with specific miRNAs to modulate their processing efficiency and thereby influence their activity.
Interplay between CELF proteins and miRNAs is essentially unstudied to date, but should perhaps be a high priority, given recent observations that miRNA expression and/or processing are affected in myotonic dystrophy [125–127].
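As a practical aside, candidate sites of the kind discussed in this section can be located with a simple sequence scan. The following is only a minimal sketch: it assumes the UGUKUGU consensus mentioned above (K being G or U in IUPAC notation), the function name `find_motif` is hypothetical, and the example 3′UTR fragment is invented purely for illustration. A sequence match of this kind does not by itself demonstrate CELF or miRNA binding.

```python
import re

# IUPAC RNA codes needed for the motifs discussed here (K = G or U).
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "K": "[GU]"}


def find_motif(seq, motif="UGUKUGU"):
    """Return 0-based start positions of every match, including overlapping ones."""
    pattern = "".join(IUPAC[base] for base in motif)
    # Accept DNA-style input by converting T to U before scanning.
    rna = seq.upper().replace("T", "U")
    # A zero-width lookahead lets overlapping sites each be reported.
    return [m.start() for m in re.finditer(f"(?={pattern})", rna)]


# Hypothetical 3'UTR fragment, invented for illustration only.
utr = "AAUGUUUGUGUGUCC"
print(find_motif(utr))  # two overlapping sites, at positions 2 and 6
```

The same function can be called with other motifs quoted in the text, for example `find_motif(utr, "UUUGUUU")`, provided every letter of the motif is present in the `IUPAC` table.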
Mammalian CELF proteins are expressed throughout development and have been linked with a variety of human conditions including cancer and neuromuscular disease. Many of the developmental roles of CELF proteins are conserved in lower organisms; thus, models ranging from Drosophila to Danio rerio have been informative with regard to the biological significance of this family of proteins.
Much of the attention focused on CELF proteins today was fueled by the finding that CELF1 is overexpressed in DM1, an inherited neuromuscular disease affecting 1 in 8000 adults (reviewed in [6, 128]). Recent studies have determined that CELF1 overexpression in mice is sufficient to reproduce many of the symptoms of DM1 such as dilated cardiomyopathy and muscle wasting [113, 129, 130]. CELF1 is also overexpressed and/or mislocalized in other neuromuscular disorders, supporting the hypothesis that it plays a special role in muscle tissue [95, 104, 131]. Deletion of CELF1 in mice is neonatally lethal in a pure genetic background and results in greatly reduced viability in mixed genetic backgrounds. Those mice that survive have significantly retarded growth and compromised fertility in both sexes. Assessment of possible muscle-specific defects in these knockout mice awaits further investigation.
Abnormal CELF1 expression in DM1 has been closely linked with the aberrant splicing patterns observed in patient tissues [129, 130, 134–136]. Indeed, the myotonia seen in patients appears to be entirely due to mis-splicing of the CLCN1 mRNA which encodes a chloride channel . In addition to myotonia, DM1 patients exhibit a large number of additional signs including muscle wasting, cardiac abnormalities, cataracts and cognitive impairment. Given the wide range of other activities CELF proteins possess, it is likely that some aspects of DM1 pathogenesis are independent of mis-splicing. For example, aberrant CELF1 function reduces translation of p21/CDKN1A in DM1 patient cells perhaps explaining impaired differentiation of these cells . There is also compelling evidence that in DM1 mouse models a significant number of dysregulated genes exhibit altered mRNA abundance rather than changes in splicing . Such effects are potentially due to altered transcription and/or mRNA decay. Further work will be required to separate the impact of various CELF1 functions on genes whose expression is altered in this disease.
Myotonic dystrophy also impacts tissues other than muscle, particularly the central nervous system (reviewed in [139–141]). The majority of DM1 patients suffer from excessive fatigue, mental deficits and executive dysfunction while the most severely affected congenital DM1 patients exhibit severe mental retardation. Consistent with these symptoms, white matter lesions and neurofibrillary tangles have been reported in the brains of patients . At least some of these problems are likely to result from direct effects on splicing of the tau mRNA . Tau is a protein that has been linked with many neurodegenerative diseases  thus it is possible that altered splicing of Tau may contribute to neurocognitive deficits associated with DM1. A wide range of other splicing changes have also been detected in DM1 patients [6, 128]. Although there is no direct evidence implicating changes in mRNA decay as a cause of brain pathology in DM1, neuronal cells derived from patients have defects in synaptogenesis and neurite outgrowth that are linked with reduced expression of SLITRK2 and SLITRK4 genes . Such down-regulation could well be induced by changes in mRNA decay.
CELF proteins have also been linked with other neurodegenerative diseases; CELF1 overexpression suppresses the neurodegenerative phenotype in a Drosophila model of Fragile-X-associated Tremor/Ataxia Syndrome, FXTAS , while CELF2 (BRUNOL3) is up-regulated in Spinal Muscular Atrophy (SMA) . The role that CELF proteins play in these disorders remains unclear but it is interesting to note that CELF1 can associate with the CGG-repeat RNA that causes FXTAS through binding to hnRNPA1/B2 . This interaction might sequester CELF1 from its normal cytoplasmic functions. It would be interesting to determine whether CELF1-mediated mRNA decay is affected in FXTAS patient cells.
Finally, advances in high-resolution genome sequencing recently led to the discovery that a neurodevelopmental disorder associated with deletions of 18q12.2 is almost certainly caused by haploinsufficiency of CELF4 . The resulting phenotype includes borderline IQ, behavioral problems, seizures, myopia and obesity. The phenotype is remarkably similar to that of CELF4−/− mice which display a complex seizure disorder [10, 148]. Deletion of CELF4 at seven weeks of age induced convulsive seizures, while earlier loss of the gene resulted in absence-like non-convulsive seizures. These results are consistent with the expression pattern of CELF4 which is restricted to neural tissues in adults, primarily in excitatory neurons . Such phenotypes are particularly interesting in light of the fact that CELF4 has a glutamine-rich domain similar to those found in the prion protein PrP and the RNA-binding protein TDP43 , both of which have also been linked with neurological conditions [150–152].
Although CELF proteins clearly play a special role in muscle and neuronal tissues, an increasing number of studies are linking these RNA-binding proteins with more fundamental cellular functions and particularly with cell proliferation, apoptosis and cancer. Moreover, many of these connections involve regulation of mRNA decay and/or translation.
CELF protein over-expression has been reported in several cancers including esophageal cancer and glioma. Many of the mRNAs targeted by CELF proteins seem to encode proteins with roles in apoptosis, including TP63, AKT, MYD88, SIAH1, UBE2C, RUNX1 and SERPINE1, suggesting that CELF1 is involved in coordinating the apoptotic response. Apoptosis is a cellular mechanism that often goes awry in cancer cells and CELF1 overexpression could therefore be involved in the aberrant apoptotic response in cancer. In esophageal cancer cells, the survivin transcript, which encodes an inhibitor of apoptosis, is stabilized through association with CELF1. CELF1 has also been implicated in modulating the stability of the mRNA encoding the apoptosis regulator p21/CDKN1A in cancer cells. In these two studies [58, 153], CELF1 over-expression led to undesirable resistance to the chemotherapeutics camptothecin or bortezomib, which kill cancer cells by inducing apoptosis. CELF1 knockdown was shown to have the opposite effect; in HeLa cells, reduced expression of CELF1 resulted in increased sensitivity to apoptosis induced by TNF and cycloheximide. Although further work is needed, together these studies suggest that CELF1 binds to numerous transcripts involved in apoptosis and that its overexpression can inhibit this process in cancer cells.
CELF2 is also associated with apoptosis; it is induced in response to apoptotic stimuli such as gamma irradiation and curcumin [59, 62]. One well-characterized target of CELF2, the COX2/PTGS2 mRNA, encodes a protein with strong anti-apoptotic activity that facilitates survival following genotoxic stress. CELF2 functions to reduce expression of COX2 following irradiation by simultaneously stabilizing the COX2 mRNA and inhibiting its translation [40, 59]. Similar effects on the MCL1 transcript, which also encodes an anti-apoptotic protein, have been reported in colon cancer cell lines, suggesting that CELF2 may coordinate down-regulation of multiple anti-apoptotic factors during stress.
The compelling links between CELF1 and cancer prompted recent investigations of cancer incidence in DM1 patients as aberrant CELF1 function has been established in this group. These studies revealed an increased overall risk of cancer in DM1 patients, especially thyroid cancer, choroidal melanoma and pilomatricoma [155–158]. In addition, indirect links between the abnormal function of CELF1 and liver , breast [59, 160] and blood  cancer pathogenesis have been described.
In conclusion, although studies of the influence of CELF1 and CELF2 proteins in cancer are in their infancy, it is already quite clear that they are important regulators of apoptosis making them viable targets for anti-cancer therapeutics.
In primary human T-cells, CELF1 regulates the changes in gene expression observed following T-cell receptor-mediated activation. T-cell receptor stimulation induces stimulus-dependent changes in the decay rates of hundreds of mRNA transcripts containing GU-rich elements (GREs) in their 3′UTRs [25, 162]. Identification of the cytoplasmic binding targets of CELF1 before and after T-cell activation led to the observation that CELF1 dissociates from GRE-containing transcripts following immune activation and associates with a different array of targets. This change correlated with the transient up-regulation of GRE-containing mRNAs, many of which encoded proteins necessary for the transition from a quiescent state to a state of cellular immune activation, such as JUN, JUNB, PP1CC, ETS2, ELF4, TGIF2 and TNFRSF1B. These data support a model whereby CELF1 functions in resting T-cells to down-regulate a network of transcripts involved in activation and proliferation. Subsequent activation-induced phosphorylation of CELF1 results in stabilization and accumulation of these transcripts within activated immune cells. Thus it appears that CELF1 plays an essential role in facilitating the immune cell response to external stimuli and coordinates regulation of a gene network involved in cellular activation.
Numerous RNA-binding proteins (RBPs) are important for fertility . This is not surprising because during the process of meiosis, transcription ceases, leaving only post-transcriptional mechanisms to regulate gene expression. As mentioned above, CELF1−/− mice have reduced fertility, and in males this coincides with spermatogenesis defects [132, 133]. CELF1 and CELF3 (BRUNOL1) are both highly expressed in testes, and CELF3−/− mice have reduced sperm count and motility but are otherwise normal and fertile . The true influence of CELF proteins in gametogenesis may be obscured by redundancy in their functions. There is still much to be learned as there is clear evidence that these proteins are involved in germ cell development in lower organisms (see below).
CELF proteins are found throughout the animal kingdom as well as in plants, and much has been learned from studying their expression and mutant phenotypes in various models from fruit flies to fish. The critical role of CELF1 during development has been illustrated by knockdown and over-expression experiments. In Xenopus, neutralizing or knocking down expression of CELF1/EDEN-BP caused developmental defects, such as the loss of somitic segmentation  primarily due to overexpression of the Notch pathway protein, XSu(H) . In this case, CELF1 directly associated with the 3′UTR of XSu(H) and modulated deadenylation and subsequent decay of this transcript and likely other transcripts involved in segmentation. In zebrafish, CELF1 overexpression profoundly affected somite symmetry and left-right patterning predominantly through destabilization of the Dmrt2a transcription factor mRNA . The experiments in Xenopus and zebrafish strongly imply that CELF1 is a major coordinator of post-transcriptional events during vertebrate somite segmentation. Moreover, as somites are the precursors of muscle, this indicates that CELF1 may play important roles throughout muscle development. In support of this, genetic deletion of CELF1/ETR-1 in Caenorhabditis elegans caused embryonic lethality with a phenotype similar to that exhibited by mutants affecting muscle formation . This confirms that CELF1 is an important factor in muscle development in diverse species and also suggests that targets of CELF1-mediated mRNA decay may be regulated during somitogenesis.
Another C. elegans CELF protein, UNC-75, is most closely related to CELF3 and CELF4 and like them is expressed primarily in the nervous system . Interestingly, Unc-75 mutants have a phenotype suggestive of defects in synaptic transmission reminiscent of that seen in CELF4 knockout mice. This supports a conserved function for CELF4 in regulating mRNA metabolism in excitatory neurons.
Drosophila has provided a particularly fruitful and perhaps undervalued model for the study of CELF protein function. The Drosophila Bruno protein encoded by the arrest (aret) gene is essential for gametogenesis and mutations in aret result in a complete failure of oogenesis as well as male sterility . Weak aret alleles allow production of viable larvae, but they exhibit complex defects in segmentation patterning. Bruno over-expression causes defects in both antero-posterior and dorso-ventral patterning – a phenotype that is perhaps reminiscent of the patterning defects seen in zebrafish over-expressing CELF1 . Two other Drosophila CELF proteins, Bru-2 and Bru-3, remain virtually uncharacterized.
Taken together, the many studies applying knockdown, knockout and over-expression approaches indicate that CELF proteins have conserved roles in gametogenesis and in somitic segmentation, as well as special functions in muscle and nerves. Some of these roles may well be redundant, as there are multiple CELF genes present in most organisms studied to date. As a result, knockout phenotypes may underestimate the real impact of these proteins in development and disease.
In conclusion, CELF proteins are powerful regulators of mRNA metabolism in many biological pathways from apoptosis to myogenesis, but it seems that we have only glimpsed the tip of the iceberg so far. In particular, in mammalian cells most studies have focused on how CELF proteins modulate splice site choice in muscle cells while their roles in other cell types and mechanisms have been understudied. With the advent of more powerful approaches to assess global mRNA abundance  and mRNA decay rates , we anticipate that the true impact of these proteins on gene expression in cancer, infertility, neuromuscular disease and other conditions will be better appreciated in the future. We expect that studies in model organisms will shed new light on the importance of CELF proteins in early embryonic development and especially in germ cells.
Simply identifying the mRNAs targeted for CELF-mediated decay will not be sufficient to provide much insight into how these proteins function. We need to identify the enzymes, RBPs and miRNAs that CELF proteins interface with and determine how these interactions are altered in diseased tissues. We especially need to work towards characterizing the post-translational modifications that CELF proteins experience, and isolating the enzymes responsible for their addition and removal. Such enzymes may represent invaluable targets for therapeutics. In addition, it will be important to assess how each post-translational modification influences RNA-binding affinity and possibly interactions with other proteins and enzymes. This is particularly important in light of the continuing debate as to whether and how CELF1 recognizes CUG repeats as well as GREs.
While CELF1 and CELF2 have been quite extensively investigated, the other CELF proteins are understudied, especially with regards to possible roles in mRNA decay. Studies in neural cell lines and tissues might be particularly informative for CELF3 and CELF4. The CELF proteins that have more restricted expression patterns could play important roles in many neurological and neuromuscular conditions, including DM1, but this simply has not been investigated. Building of biological networks and pinning down specific interactions would be a tremendous asset to understanding how these RBPs are involved in pathogenesis.
The links between CELF proteins and the stress response are also in need of more focused experimentation. As we discussed, CELF1 relocalizes to SGs and interacts with the TIA1 RNA-binding protein that initiates assembly of these cytoplasmic bodies. It would be interesting to know how this affects stability of CELF1 target transcripts on a global scale, and also whether CELF1 facilitates decay or translation as the stress is resolved . Moreover, as there is evidence for formation of cytoplasmic granules during differentiation of both T-cells  and muscle cells , similar effects may be occurring during stress responses and normal developmental changes.
C.J.W. is supported by the National Institute of Arthritis and Musculoskeletal and Skin Diseases of the National Institutes of Health under Award Number AR059247. P.R.B. and I.A.V-S. are supported by NIH grant AI072068. A.M.D. is supported by a postdoctoral fellowship from the Myotonic Dystrophy Foundation. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or the Myotonic Dystrophy Foundation. We thank Dr. Ying Zhang (University of Minnesota Supercomputing Institute) and Dr. Eric Ross (Colorado State University) for helpful discussions regarding protein structures and prion domains.