RNAs in cells are associated with RNA-binding proteins (RBPs) to form ribonucleoprotein (RNP) complexes. The RBPs influence the structure and interactions of the RNAs and play critical roles in their biogenesis, stability, function, transport and cellular localization. Eukaryotic cells encode a large number of RBPs (thousands in vertebrates), each of which has unique RNA-binding activity and protein-protein interaction characteristics. The remarkable diversity of RBPs, which appears to have increased during evolution in parallel to the increase in the number of introns, allows eukaryotic cells to utilize them in an enormous array of combinations giving rise to a unique RNP for each RNA. In this short review, we focus on the RBPs that interact with pre-mRNAs and mRNAs and discuss their roles in the regulation of post-transcriptional gene expression.
In prokaryotes, transcription and translation are physically coupled. In eukaryotes, these two processes occur in separate compartments, the nucleus and the cytoplasm, respectively. This allows eukaryotes to carry out extensive post-transcriptional processing of pre-mRNA that produces a more diverse assortment of mRNAs from its genome and provides an additional layer of gene regulation. The pre-mRNA processing reactions, including splicing, editing and polyadenylation, commence as soon as pre-mRNAs emerge from their sites of transcription and are mediated by RBPs and trans-acting RNAs, themselves present as RNPs (e.g. snRNPs). Although all RBPs bind RNA, they do so with different RNA-sequence specificities and affinities. This activity is mediated by a relatively small number of RNA-binding scaffolds whose properties are further modulated by auxiliary domains. The auxiliary domains can also mediate the interactions of the RBP with other proteins and, in many cases, are subject to regulation by post-translational modification. As a result, cells are able to generate numerous RNPs whose composition and arrangement of components is unique to each mRNA and the RNPs are further remodeled during the course of the maturation of the mRNA into its functional form. While our focus here is on the RBPs that are associated with pre-mRNAs and mRNAs, we note that many RBPs are associated with other classes of RNAs (for a recent review see ), and all of these are important for cell physiology (Figure 1). Many of the features of RBPs that we discuss, however, are general and also apply to RBPs that are part of many different types of RNPs. In the following, we discuss select examples that illustrate general principles of the biochemistry and cell biology of RBPs to highlight their central role in gene expression.
The discovery of the heterogeneous nuclear ribonucleoproteins (hnRNP) and other pre-mRNA/mRNA-binding proteins led to the identification of the first amino acid motifs and functional domains that confer binding to RNA. RBPs contain one or, more often, multiple RNA-binding domains. Some well-characterized RNA-binding domains include the following: RNA-binding domain (RBD, also known as RNP domain and RNA recognition motif, RRM); K-homology (KH) domain (type I and type II); RGG (Arg-Gly-Gly) box; Sm domain; DEAD/DEAH box; zinc finger (ZnF, mostly C-x8-C-x5-C-x3-H); double-stranded RNA-binding domain (dsRBD); cold-shock domain; Pumilio/FBF (PUF or Pum-HD) domain; and the Piwi/Argonaute/Zwille (PAZ) domain (Figure 2) (for review see [3,4]). Using these motifs, bioinformatic analyses revealed that eukaryotic genomes encode a large number of RBPs. In yeast, 5–8% of genes encode proteins predicted to function as RBPs, and in C. elegans and D. melanogaster, approximately 2% of the genome is annotated to encode RBPs [5–7]. However, it is likely that the number of RBPs is much higher, since there are probably other RNA-binding domains that remain to be uncovered. Why do eukaryotes need so many – hundreds and perhaps thousands of – RBPs? One possible explanation is that as eukaryotes evolved highly specific post-transcriptional processes to fine-tune gene expression, a concomitant expansion of the number of RBPs needed to function in these processes has occurred. For example, in both vertebrates and plants, the emergence of alternative splicing during evolution drove the need for a corresponding increase in the number of RBPs.
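As a rough illustration of how a sequence motif can be used to flag candidate RBPs, the following Python sketch scans a protein sequence for the C-x8-C-x5-C-x3-H (CCCH-type) zinc finger spacing with a regular expression. The function name and the toy sequence are hypothetical, and genome-wide surveys generally rely on curated domain profiles rather than a single pattern, so this should be read as a minimal sketch only.

```python
import re

# C-x8-C-x5-C-x3-H: Cys, any 8 residues, Cys, any 5, Cys, any 3, His.
# The lookahead lets overlapping candidates be reported as well.
CCCH = re.compile(r"(?=(C.{8}C.{5}C.{3}H))")

def find_ccch(protein):
    """Return (start, segment) for every stretch with CCCH-type spacing."""
    return [(m.start(), m.group(1)) for m in CCCH.finditer(protein.upper())]

# Hypothetical toy sequence containing one such spacing.
toy = "MA" + "C" + "A" * 8 + "C" + "G" * 5 + "C" + "S" * 3 + "H" + "KK"
print(find_ccch(toy))
```

A hit from a pattern like this is only a candidate; whether the protein actually binds RNA has to be established experimentally.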
It is certain that many RBPs remain yet to be characterized. Several methods have been developed to identify and characterize the RBPs and the RNAs with which they interact. The hnRNP and mRNP complexes were initially isolated by ultraviolet (UV) cross-linking of RNA-protein complexes in vivo [9–15]. This is a reliable and effective method to detect RNA-protein interactions, as it circumvents the adventitious association of proteins with RNAs that could occur after cell lysis. Recently, this method has been adapted, using tagged proteins and including an immunoprecipitation step following cross-linking (cross-linking and immunoprecipitation or CLIP). Procedures to detect and delineate RNA-protein interactions in vitro include systematic evolution of ligands by exponential enrichment (SELEX) and electrophoretic mobility shift assay (EMSA). A yeast three-hybrid system has been devised as a screening method to identify RBPs and their target RNAs [19–21]. Several approaches have been utilized to identify RNA targets. For example, the RIP assay, which combines reversible cross-linking with formaldehyde followed by immunoprecipitation and RT-PCR, has been used to identify hepatitis delta antigen (HDAg) interactions with HDV RNAs and U1 snRNP protein-RNA interactions. An affinity tag may also be introduced to facilitate the isolation of an RBP of interest, followed by analysis of associated RNAs using microarrays, an approach that has been successfully used to identify RNAs that associate with PUF proteins in S. cerevisiae. Bioinformatics approaches can also be used to identify RNA targets if a consensus and non-degenerate RNA-binding sequence is known. In addition, traditional genetic approaches and reverse genetics can be employed to identify both RBPs and their target RNAs. For example, RNAi screening in cultured D. melanogaster cells using a candidate gene approach has been successfully used to examine which RBPs are involved in alternative splicing.
Taken together, a considerable array of technologies is now available to discover and further study the many RBPs that bioinformatics predicts to be present.
At the structural level, RBPs often exhibit a high degree of modularity, as most contain one or more RNA-binding and auxiliary domains (for review see ). This modularity creates both RNA-binding and functional diversity within the RBPs. The most extensively studied RNA-binding domain, the RBD, is often found as multiple repeats within a single protein, exemplified by the polypyrimidine tract-binding protein (PTB/hnRNP I), poly(A) binding protein (PABP), U2AF65 and U1A . Although a single RBD, which typically can bind 2 – 6 nucleotides, is sufficient for binding RNA, having multiple copies of this domain enables the recognition of larger, more complex RNA targets, enhancing the specificity and affinity of binding . A similar principle is found in PUF proteins. These typically contain eight consecutive Puf RNA-binding repeats, each of which consists of approximately 40 amino acids that form three α-helices [26–28]. The crystal structure of human Pumilio bound to RNA revealed that each of the eight repeats recognizes a single nucleotide in its target RNA, to bind a total of eight consecutive nucleotides . This specific and high affinity interaction, in combination with its modular design, enables a unique and remarkably predictable PUF-RNA interaction that can be exploited to engineer proteins that bind sequences other than wild-type [27,29,30].
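The one-repeat, one-nucleotide logic of PUF proteins can be sketched in a few lines of Python. The preference table, the antiparallel ordering of repeats relative to the RNA and the function names below are illustrative assumptions rather than data taken from this review; the sketch is only meant to show why altering a single repeat changes the recognized sequence in a predictable way.

```python
# Hypothetical base preferences for an eight-repeat PUF domain (assumption).
# Assumed orientation: repeat 8 contacts RNA position 1, repeat 1 contacts position 8.
WILD_TYPE = {8: "U", 7: "G", 6: "U", 5: "A", 4: "ACGU", 3: "A", 2: "U", 1: "A"}

def puf_site(preferences):
    """Allowed bases at RNA positions 1..8, read 5' to 3'."""
    return [preferences[repeat] for repeat in range(8, 0, -1)]

def binds(preferences, rna):
    """True if every nucleotide of the 8-nt RNA satisfies its repeat."""
    site = puf_site(preferences)
    return len(rna) == len(site) and all(
        base in allowed for base, allowed in zip(rna, site)
    )

# Re-engineer a single repeat: repeat 6 (RNA position 3) now prefers G.
engineered = {**WILD_TYPE, 6: "G"}

print(binds(WILD_TYPE, "UGUAUAUA"), binds(engineered, "UGGAUAUA"))
```

Because each repeat is treated independently here, the model ignores any cooperativity or context effects between neighboring repeats.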
A further diversity of RBPs is achieved by combining RNA-binding domains with auxiliary functional domains. ADAR2 and PKR are two RBPs that have similar RNA-binding domains, the dsRBD, but differ in their auxiliary domains and their associated functions. ADAR2 combines its dsRBD with a deaminase domain that converts adenosine to inosine in its target RNAs, while PKR incorporates a kinase domain [31,32]. As PKR binds double-stranded RNA, it is converted to an active state where subsequent autophosphorylation triggers many downstream events . The dsRBD of PKR is thus able to autoregulate its kinase domain due to the modularity of its structure.
Alternative splicing is yet another mechanism by which cells can expand their repertoire of RBPs. For example, alternative splicing of the polypyrimidine tract binding protein (PTB/hnRNP I) mRNA generates a splice variant that lacks the first two RBDs, and the corresponding PTB isoform may affect the stability of the CD154 mRNA. Another example of alternatively spliced RBPs is the poly(C) binding protein family, which includes hnRNPs K/J and the αCPs (αCP-1 to -4) (for review see ). HnRNP K appears to have at least four alternative splice variants. αCP-2 and αCP-4, two KH domain RNA-binding proteins, are both alternatively spliced. However, for these examples, isoform-specific functions remain to be determined.
Post-translational modification of RBPs generates additional layers of complexity, as it can modify the RNA binding, function and localization of the RNP. Three types of modifications have been described for RBPs: phosphorylation, arginine methylation and conjugation of the small ubiquitin-like modifier (SUMO). Phosphorylation of αCP-1 and αCP-2 decreases their poly(rC)-binding activity. Growth factors, oxidative stress and other stimuli can alter the phosphorylation status of hnRNP K [38–40]. Methylation of RGG repeats is found in several RBPs, including the hnRNPs (for review see ). In S. cerevisiae, two RBPs involved in mRNA processing and export, Hrp1 and Yra1 (Aly/REF in metazoans), have been shown to be methylated by the major type I arginine methyltransferase, Hmt1 [42,43]. It is possible that this methylation plays a role in the formation of Hrp1- and Yra1-containing RNPs. SUMO modification of hnRNP C and hnRNP M results in conformational and/or compositional changes in these RNPs at the nuclear pore and could therefore play a role in the regulation of nucleocytoplasmic transport.
Cell type-specific and developmentally regulated expression also serves to alter the stoichiometry of a cell’s RBPs. Changes in the relative amounts of hnRNP A/B proteins have been suggested to regulate alternative splicing, for example that of the 4.1R transcript during mouse erythropoiesis. Specifically, the hnRNP A/B proteins interact with a conserved splicing silencing element (CE16) in exon 16 (E16) of the 4.1R transcript, leading to increased exclusion of E16. In turn, down-regulation of hnRNP A/B proteins during erythropoiesis correlates with E16 inclusion. This illustrates the importance of RBPs as modulators of a process, in this case alternative splicing, in the broader context of cellular differentiation.
RBPs function in every aspect of RNA biology, from transcription, pre-mRNA splicing and polyadenylation to RNA modification, transport, localization, translation and turnover. The RBPs not only influence each of these processes, but also provide a link between them [46–49]. Proper functioning of these intricate networks is essential for the coordination of complex post-transcriptional events, and their perturbation can lead to disease.
At least 74% of human genes express multiple mRNAs through alternative splicing. RBPs also function in the regulation of this process. For example, the neuron-specific Nova proteins, each containing three KH domains, control the alternative splicing of a subset of pre-messenger RNAs (e.g. gephyrins 1–2, JNK2, flamingo 1, neogenin) by recognizing intronic YCAY elements (Y indicates a pyrimidine, U or C). The majority of Nova target mRNAs encode proteins that function in the synapse, thus linking Nova proteins to the regulation of factors involved in maintaining neuronal plasticity. Loss of Nova proteins, as a result of autoimmune paraneoplastic neurologic disorder (PND), manifests itself in neurologic symptoms of excess motor movements (Paraneoplastic Opsoclonus Myoclonus Ataxia, POMA) [51,52]. The TAR DNA binding protein (TDP43), which interacts with (UG)6–12 motifs in single-stranded RNA through its two RBDs, is involved in the regulation of splicing of the cystic fibrosis CFTR (cystic fibrosis transmembrane conductance regulator) mRNA, which encodes a Cl− channel. TDP43 binds an extended stretch of UG repeats in a (UG)U-rich polymorphic region upstream of the 3′ splice site in intron 8, which causes exon 9 skipping in the CFTR mRNA, consequently producing nonfunctional chloride channels in patients with cystic fibrosis [53,54]. In the case of CFTR, the repeats in the transcript affect the function of the encoded protein. However, there are a number of diseases associated with repeats where the aberrant RNA mediates the disease by a gain-of-function mechanism. This is the case for myotonic dystrophy (DM). DM type I (DM1) is caused by a CUG triplet-repeat expansion (from 50 to >1500 repeats) in the 3′UTR of the DMPK mRNA [55,56]. This mutant mRNA is retained in the nucleus through its interaction with two splicing regulators, muscleblind-like protein 1 (MBNL1) and CUG-binding protein 1 (CUG-BP1) [56,57], causing splicing defects.
MBNL1 becomes sequestered on the mislocalized repeat-containing RNAs which results in nuclear depletion and loss of function . CUG-BP1 steady state-levels, on the other hand, are increased in DM1 due to hyperphosphorylation of the protein . The resulting change in the ratio of MBNL1 to CUG-BP1 is correlated with aberrant splicing of their target pre-mRNAs .
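The short sequence elements mentioned in this section (YCAY clusters for Nova, UG repeats for TDP43 and CUG repeats in DM1) lend themselves to simple pattern searches. The following Python sketch shows one way this might be done; the function names and the toy sequences are hypothetical, and the repeat thresholds are taken only from the ranges quoted in the text.

```python
import re

# Y = pyrimidine (C or U); the lookahead reports overlapping YCAY elements.
YCAY = re.compile(r"(?=[CU]CA[CU])")

def ycay_positions(rna):
    """0-based start positions of every YCAY element."""
    return [m.start() for m in YCAY.finditer(rna)]

def longest_tandem_run(unit, rna):
    """Largest number of uninterrupted tandem copies of `unit` in the RNA."""
    runs = re.findall(rf"(?:{unit})+", rna)
    return max((len(run) // len(unit) for run in runs), default=0)

def has_ug_tract(rna, min_repeats=6):
    """True if the RNA carries at least (UG)6, the lower bound quoted for TDP43."""
    return longest_tandem_run("UG", rna) >= min_repeats

def in_dm1_range(utr3, min_repeats=50):
    """True if the 3'UTR carries 50 or more tandem CUG repeats."""
    return longest_tandem_run("CUG", utr3) >= min_repeats

print(ycay_positions("AAUCAUCACGG"))
```

Counting motifs in this way says nothing about binding affinity or RNA structure; it only identifies sequences that would merit closer examination.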
The most prevalent type of RNA editing is the conversion of adenosine (A) to inosine (I) by deamination, a reaction catalyzed by the ADAR proteins. This processing results in an RNA sequence that is different from that encoded by the genome and extends the diversity of the gene products. While the majority of RNA editing occurs in non-coding regions, a few genes have been identified that are subject to editing in their coding sequences. The pre-mRNA substrate required by an ADAR enzyme is often an imperfect duplex RNA formed by base-pairing between the exon that contains the adenosine to be edited and an intronic non-coding element. A classic example of A-to-I editing is the glutamate receptor GluR-B mRNA, where a glutamine at the editing site is converted to an arginine. This modification changes the conductance properties of the altered channel. Most of the A-to-I modifications described to date are limited to transcripts in the nervous system encoding ion channels, G-protein coupled receptors and the glutamate and serotonin receptors. Mutations in the Drosophila ADAR gene result in neuronal dysfunction, whereas a homozygous Adar null mutation in mice results in embryonic lethality [63–65]. In humans, a heterozygous functional-null mutation in the ADAR1 gene is less severe and leads to a skin disease, human pigmentary genodermatosis.
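The coding consequence of A-to-I editing can be made concrete with a short Python sketch. It assumes, as is generally accepted, that inosine is decoded as guanosine during translation, and it uses a CAG (glutamine) codon as the editing site; the function names are hypothetical and only the two codons needed for the example are included.

```python
# Only the two codons needed for this example.
CODONS = {"CAG": "Gln", "CGG": "Arg"}

def edit_a_to_i(rna, position):
    """Deaminate the adenosine at 0-based `position` to inosine (written I)."""
    if rna[position] != "A":
        raise ValueError("ADAR enzymes act on adenosine")
    return rna[:position] + "I" + rna[position + 1:]

def as_decoded(rna):
    """Inosine base-pairs like guanosine, so it is read as G."""
    return rna.replace("I", "G")

codon = "CAG"                    # glutamine before editing
edited = edit_a_to_i(codon, 1)   # CIG
print(CODONS[codon], "->", CODONS[as_decoded(edited)])
```

A single deamination in the middle position of the codon is thus sufficient to recode glutamine as arginine without any change in the genomic sequence.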
Polyadenylation of an mRNA has a strong effect on its nuclear transport, translation efficiency and stability, and all of these, as well as the process of polyadenylation, depend on specific RBPs. All eukaryotic mRNAs, with the exception of replication-dependent histone mRNAs, are processed to receive 3′ poly(A) tails of ~200 nucleotides. Polyadenylation is a tightly coupled two-step process in which the transcript is first cleaved between the highly conserved AAUAAA sequence upstream and a degenerate U/GU-rich sequence downstream of the cleavage site, after which the poly(A) polymerase adds the poly(A) tail to the cleavage product. One of the necessary protein complexes in the polyadenylation process is CPSF, which consists of at least 4 polypeptides and binds the canonical AAUAAA site, of which CPSF-160 and CPSF-30 appear to be the key RNA-binding subunits. CPSF, together with the nuclear poly(A) binding protein (PABPN1), stimulates the activity of the poly(A) polymerase, which is essentially inactive on its own. For PABPN1 to interact with the poly(A) tail it needs both the RBD and the arginine-rich C-terminal domain. Short GCG expansions in the coding region of PABPN1 mRNA have been found to cause oculopharyngeal muscular dystrophy (OPMD). These triplet repeats give rise to an expanded polyalanine tract in the protein that likely causes mutated PABPN1 oligomers to accumulate as filament inclusions in the nuclei of skeletal muscle fibers, thus eliciting nuclear toxicity. PABPN1 is post-translationally modified by arginine methylation, and it was recently shown that unmethylated PABPN1 oligomerizes more readily than methylated PABPN1 [72,73]. This suggests that the methylation state of the protein also influences the extent of nuclear aggregation in OPMD.
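The two sequence landmarks that flank the cleavage site, AAUAAA upstream and a U/GU-rich element downstream, can be searched for with a crude Python sketch such as the one below. The search distance, the window length and the G/U threshold are arbitrary assumptions chosen for illustration, and the function names are hypothetical; this is not a validated poly(A) site predictor.

```python
def u_gu_rich(segment, min_gu=8):
    """Crude test for a U/GU-rich window: at least `min_gu` G or U residues."""
    return sum(base in "GU" for base in segment) >= min_gu

def candidate_polya_sites(rna, gap=40, dse_len=10):
    """Return (signal_start, downstream_element_start) pairs in which AAUAAA
    is followed, within `gap` nt, by the first U/GU-rich window of `dse_len` nt."""
    hits = []
    start = rna.find("AAUAAA")
    while start != -1:
        after = start + 6
        last = min(after + gap, len(rna) - dse_len + 1)
        for dse_start in range(after, last):
            if u_gu_rich(rna[dse_start:dse_start + dse_len]):
                hits.append((start, dse_start))
                break
        start = rna.find("AAUAAA", start + 1)
    return hits

# Hypothetical toy 3' end: signal, spacer, GU-rich element.
toy = "CCCC" + "AAUAAA" + "C" * 15 + "UGUGUUGUGU" + "CC"
print(candidate_polya_sites(toy))
```

In this picture the cleavage site would lie somewhere between the two reported coordinates; the sketch does not attempt to place it more precisely.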
Normally, once pre-mRNA processing is complete, the translation-ready mRNA is exported from the nucleus to the cytoplasm. The cell therefore requires a mechanism to ensure that only fully processed mRNPs are exported. That is, transcription, splicing and 3′-end processing of the mRNAs must be completed before export can occur. mRNA export is an excellent example of the dynamic network of rearrangements in which RBPs participate. It is a three-step process involving the generation of a cargo-carrier complex in the nucleus, followed by translocation of the complex through the nuclear pore complex, and finally, release of the cargo in the cytoplasm with subsequent recycling of the carrier. The TAP/NXF1:p15 heterodimer is a key player in mRNA export. TAP (known as Mex67 in S. cerevisiae) was first shown to bind to the constitutive transport element (CTE), an element required for export of retroviral transcripts, and it was later demonstrated that TAP also has a role in mRNA export [74,75]. Overexpression of TAP in Xenopus oocytes increases the export of transcripts that are otherwise inefficiently exported, suggesting a direct role for TAP in mRNA export. As both TAP and p15 show low affinity for RNA, they require adaptor proteins to mediate the interaction [76,77]. The Aly/REF protein, which directly interacts with TAP, recruits TAP to mRNA, although the precise mechanistic details of mRNA export remain unclear [78,79].
mRNA localization is critical for gene expression because it allows spatially regulated protein production. Localization of transcripts to a specific region of the cell during development has been particularly well studied in S. cerevisiae and D. melanogaster. For example, during cell division in S. cerevisiae, ASH1 mRNA is actively localized to the bud of the daughter cell by its association with myosin (Myo4) and actin. This interaction depends on two other proteins, She2 and She3. She2 binds as a dimer to localization elements located partly in the coding region and in the 3′UTR of the ASH1 mRNA. Binding to RNA increases the affinity of She2 for the C-terminus of She3, which then binds Myo4 through its N-terminus. The resultant localized expression of the Ash1 protein is necessary for the suppression of mating type switching in the daughter cell by repressing the transcription of the HO endonuclease gene. Another example that highlights how nuclear-acquired factors impact cytoplasmic mRNA metabolism is the localization of β-actin mRNA to the lamella region in several asymmetric cell types by the zipcode-binding protein (ZBP1) [83,84]. ZBP1 contains four KH domains and one RBD. It binds to β-actin mRNA at the site of transcription through a 54 nt localization element in the 3′ UTR of β-actin, termed the zipcode, and moves with the mRNA into the cytoplasm. This interaction is essential for proper β-actin mRNA localization in the cytoplasm [83,84].
Translational regulation provides a rapid mechanism to control gene expression, and numerous regulatory proteins target the initiation step, often in a way that couples translation to mRNA localization. ZBP1, in addition to its role in the localization of β-actin mRNA, is involved in the translational repression of β-actin mRNA by blocking translation initiation . It is thought that phosphorylation of ZBP1 by the Src tyrosine kinase leads to decreased binding affinity to β-actin mRNA, and ultimately derepression of translation . The dual role of ZBP1 makes it a valid candidate in linking transport and translational repression of β-actin mRNA.
Many species depend on distinct regulatory systems to keep mRNAs translationally silent during different stages of development. In the C. elegans germ line, for example, the KH domain protein GLD-1 represses the translation of pal-1 mRNA by binding to a germline repression element (GRE) in its 3′UTR . The PAL-1 protein initiates a transcription regulatory network in the later blastomere lineages, and therefore needs to be translationally repressed in oocytes and early embryos .
Translation is tightly coupled to mRNA turnover and regulated mRNA stability. The ELAV/Hu proteins are involved in the stability and translation of early response gene and AU-rich transcripts predominantly in neurons . HuB, HuC and HuD are neuron-specific ELAV proteins, whereas HuR is ubiquitously expressed . Each contains three RBDs, the first two of which confer binding to AU-rich elements (AREs) . These proteins stabilize many of their AU-rich target mRNAs (e.g. c-fos, GM-CSF, EGF) . HuR appears to be stabilizing its target transcripts by protecting the messages from degradation in the cytoplasm . In addition, HuR colocalizes with polysomes, suggesting that it binds to ARE-containing mRNAs undergoing translation . Patients with paraneoplastic neurological disorder (PND) develop autoantibodies against HuC and HuD in tumors outside of the central nervous system [51,52]. These antibodies, as well as inflammatory cells, are able to cross the blood-brain barrier resulting in PND-associated encephalomyelitis and neuronopathy .
Many RBPs, for example the abundant hnRNP and serine/arginine-rich (SR) proteins, bind to multiple sites on numerous RNAs to function in diverse processes. The hnRNP A1 protein can bind to exonic splicing silencer sequences and regulate alternative splicing by antagonizing the SR splicing factors . Additionally, hnRNP A1 has been shown to stimulate telomerase activity by associating with telomere ends . Recently, hnRNP A1 was found to bind to human pri-mir18a, the precursor of miR-18a, and to facilitate its Drosha-mediated processing . This is the first time an RBP has been implicated in miRNA maturation.
RNA-protein, and hence the sequence or structure of the RNA target, and protein-protein interactions are critical factors in determining the formation of an RNP. However, often more than one RBP has the capacity to bind to a specific sequence on the target RNA. The complement of RBPs present at a particular locale where the RNA is transcribed or changes in the post-translational modifications of these proteins would affect the resulting RNP complex, modulating its downstream functional activity. The recruitment of additional proteins to the RNP can result in the regulated formation of a highly dynamic complex. Here, we discuss two well-characterized examples of RNP assembly, the exon-junction complex (EJC) and the CPE-binding protein (CPEB) RNP.
The EJC is a large (~335 kDa in vitro) RNP that preferentially binds mRNAs produced by splicing [95–97]. It binds these newly spliced mRNAs approximately 20–24 nucleotides upstream of exon-exon junctions [96,97]. Proteins known to comprise the core EJC include eIF4AIII, Y14, magoh and MLN51/Barentsz [98–100]. Other proteins that associate with the core complex include RNPS1, SRm160, Aly/REF, PYM and Upf3 [95,97,100–107]. Of the core components, the best characterized interaction occurs between Y14 and magoh, two proteins that are found in spliceosomes following the first step of splicing [108,109]. While Y14, which contains an RBD, was initially a candidate for binding mRNA directly, the crystal structures of human and Drosophila Y14:magoh revealed quite unexpectedly that the RBD is masked through its interaction with magoh and, thus, appears unable to directly contact mRNA [95,110–112]. Rather, it is more likely that eIF4AIII, a spliceosome-associated ATP-dependent DEAD-box RNA helicase, acts as the central factor in the initiation of EJC formation because it binds specifically to mRNA during the late stages of splicing and also binds Y14:magoh, perhaps serving to recruit these proteins to the exon-exon junction [98–100,107,113]. The EJC also enhances the association of the mRNA nuclear export factor TAP/NXF1:p15 with the mRNP, and this promotes transport of the complex through the nuclear pore to the cytoplasm [78,105,106,114,115]. Interestingly, while most EJC proteins dissociate from the mRNA either during or immediately following export, Y14 and magoh remain bound to the mRNA until it is translated, suggesting that they may play an additional role in this process. Consistent with this idea, tethering of Y14, magoh or RNPS1 to mRNA can enhance the translation efficiency. Recently, PYM has been shown to bind the cytoplasmic Y14:magoh complex, in addition to the 48S preinitiation complex and the small (40S) ribosomal subunit.
In addition, knockdown of PYM results in a decrease in translation of spliced mRNAs. These data suggest that PYM may deliver spliced mRNAs containing the EJC to the translational apparatus in the cytoplasm to enhance protein production. The EJC can also serve as a marker to indicate mRNAs that have premature stop codons located upstream of the EJC (for review see ). Upf3, a protein that functions in nonsense-mediated decay (NMD), is also a component of the EJC [105,120]. It appears that information encoded by mRNAs, such as the presence or absence of premature termination codons, may be marked in the nucleus for subsequent communication to the translation or NMD apparatus in the cytoplasm. The dynamic nature of the association of proteins with mRNA in the EJC RNP over a history of processes – from splicing to export, translation and mRNA degradation – implies that this highly ordered RNP assembly is important for timely and coordinated gene expression.
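The positional logic of the EJC as a marker can be illustrated with a small Python sketch. Given the exon lengths of a spliced mRNA, it places one EJC a fixed distance upstream of each exon-exon junction and asks whether a stop codon lies upstream of any of them. The offset of 24 nucleotides is one value from the 20–24 nucleotide range given above, the function names and coordinates are hypothetical, and any additional distance requirements that NMD may impose are deliberately ignored.

```python
def ejc_positions(exon_lengths, offset=24):
    """0-based mRNA coordinates of EJCs, each `offset` nt upstream of a junction."""
    positions, junction = [], 0
    for length in exon_lengths[:-1]:   # the last exon has no downstream junction
        junction += length
        positions.append(junction - offset)
    return positions

def stop_is_upstream_of_ejc(exon_lengths, stop_codon_pos, offset=24):
    """True if at least one EJC lies downstream of the stop codon,
    i.e. the stop codon would be flagged as potentially premature."""
    return any(pos > stop_codon_pos for pos in ejc_positions(exon_lengths, offset))

exons = [100, 150, 200]                      # hypothetical exon lengths in nt
print(ejc_positions(exons))                  # junctions at 100 and 250
print(stop_is_upstream_of_ejc(exons, 120))   # stop codon in the second exon
```

A stop codon in the last exon never has an EJC downstream of it in this model, which is consistent with the normal termination codon not being marked as premature.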
The CPE-binding protein (CPEB) RNP is a large, dynamic complex that functions in cytoplasmic polyadenylation and translational regulation (for review see ). CPEB itself contains two RNA-binding domains, an RBD and a zinc finger, and is highly conserved in both vertebrates and invertebrates [122,123]. In the cytoplasm, CPEB first binds to the cytoplasmic polyadenylation element (CPE; UUUUUAU consensus sequence), located within the 3′ UTR of some mRNAs, and then initiates the assembly of an RNP complex that contains the following proteins: CPSF; PARN, a deadenylating enzyme that contains two RBDs; Gld2, a poly(A) polymerase; and symplekin [124–128]. When the CPE-containing mRNA is translationally repressed, PARN deadenylation is more active than Gld2 polyadenylation, resulting in shortening of the poly(A) tail. However, in the case of oocyte maturation, phosphorylation of CPEB Ser174 by Aurora A kinase results in the dissociation of PARN from the RNP complex, allowing Gld2 polyadenylation of the CPE-containing mRNA [128,129]. The maskin protein appears to provide a direct link between translation regulation and the CPEB RNP. Maskin interacts with both CPEB and the cap-binding factor eIF4E [130,131]. When the poly(A) tail is short, maskin binds eIF4E, thereby occluding the binding of eIF4G. As a result, the 40S ribosomal subunit cannot be recruited to the mRNA and translation is repressed. However, when the poly(A) tail is elongated, the poly(A)-binding protein (PABP) binds the poly(A) tail and interacts directly with eIF4G to abrogate maskin’s interaction with eIF4E, allowing the mRNA to be translated [130,132]. Subtle changes in the protein composition and mRNA polyadenylation status of this cytoplasmic RNP complex can determine the fate of the mRNA to which it is bound.
These changes ultimately dictate whether the poly(A) tail in the CPE-containing mRNA is deadenylated and therefore translationally repressed, or is elongated and consequently subject to translation initiation. The repertoire of RBPs that binds a particular RNA is often highly influential in determining which RNPs form and, ultimately, the functional roles they play.
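The switch-like behavior of the CPEB RNP described above can be summarized as a small qualitative model in Python. The function name, the dictionary keys and the reduction of each step to a yes/no decision are simplifying assumptions made for illustration; the sketch encodes only the sequence of events given in the text and does not attempt to model kinetics.

```python
CPE = "UUUUUAU"  # consensus cytoplasmic polyadenylation element

def cpeb_rnp_outcome(utr3, cpeb_ser174_phosphorylated):
    """Qualitative fate of an mRNA under the CPEB model described in the text."""
    if CPE not in utr3:
        return {"cpe": False}   # CPEB is not recruited; the model says nothing more
    # Phosphorylation of CPEB releases PARN from the complex.
    parn_in_complex = not cpeb_ser174_phosphorylated
    # While PARN is present, deadenylation outpaces Gld2 polyadenylation.
    tail = "short" if parn_in_complex else "elongated"
    # On a short tail maskin holds eIF4E; PABP on a long tail relieves this block.
    maskin_blocks_eif4g = tail == "short"
    return {
        "cpe": True,
        "parn_in_complex": parn_in_complex,
        "poly_a_tail": tail,
        "translated": not maskin_blocks_eif4g,
    }

print(cpeb_rnp_outcome("AAUUUUUAUGG", cpeb_ser174_phosphorylated=False))
print(cpeb_rnp_outcome("AAUUUUUAUGG", cpeb_ser174_phosphorylated=True))
```

Written this way, the model makes explicit that a single phosphorylation event is the input and that poly(A) tail length is the intermediate through which translation is controlled.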
Since the definitive identification of the hnRNP proteins and the discovery of the first consensus motifs in RBPs more than two decades ago, the list of RBPs and the multitude of functions in which they participate have expanded enormously. In recent years, biochemical and genetic experiments as well as bioinformatic analysis of several sequenced genomes revealed a vast array of RBPs about which little is known. It is very likely that the inventory of RBPs is much larger, as it is doubtful that all of the RNA-binding motifs have already been discovered. From what has been learned so far, it is clear that RBPs are critical components of the gene expression pathway in eukaryotes. Their capacity to regulate every aspect of the biogenesis and function of RNAs is remarkable. It is also clear, however, that a great deal of information is lacking about the structure of RBPs, their mode of interaction with RNAs and the specific arrangements of these proteins in the complex RNP assemblies that they form on pre-mRNAs and mRNAs. Given the impressive progress that has already been made, the enormous number of RBPs that remain to be characterized and the rich arsenal of tools available to study them, the promise of what the study of RBPs still has in store for understanding biology and many diseases is tremendously exciting.
We thank the members of our laboratory, especially Dan Battle, Mumtaz Kasim, Chi-kong Lau, Lili Wan and Ihab Younis for helpful discussions and comments on this manuscript, and Sharon Kontra for secretarial assistance. G.D. is an investigator of the Howard Hughes Medical Institute.