|Home | About | Journals | Submit | Contact Us | Français|
Conceived and designed the experiments: JMP-C. Performed the experiments: YM SM-L PR-L JMP-C. Analyzed the data: CS YM SM-L JMP-C. Wrote the paper: CS JMP-C.
Pub1p, a highly abundant poly(A)+ mRNA binding protein in Saccharomyces cerevisiae, influences the stability and translational control of many cellular transcripts, particularly under some types of environmental stresses. We have studied the structure, RNA and protein recognition modes of different Pub1p constructs by NMR spectroscopy. The structure of the C-terminal RRM domain (RRM3) shows a non-canonical N-terminal helix that packs against the canonical RRM fold in an original fashion. This structural trait is conserved in Pub1p metazoan homologues, the TIA-1 family, defining a new class of RRM-type domains that we propose to name TRRM (TIA-1 C-terminal domain-like RRM). Pub1p TRRM and the N-terminal RRM1-RRM2 tandem bind RNA with high selectivity for U-rich sequences, with TRRM showing additional preference for UA-rich ones. RNA-mediated chemical shift changes map to β-sheet and protein loops in the three RRMs. Additionally, NMR titration and biochemical in vitro cross-linking experiments determined that Pub1p TRRM interacts specifically with the N-terminal region (1–402) of yeast eIF4G1 (Tif4631p), very likely through the conserved Box1, a short sequence motif neighbouring the Pab1p binding site in Tif4631p. The interaction involves conserved residues of Pub1p TRRM, which define a protein interface that mirrors the Pab1p-Tif4631p binding mode. Neither protein nor RNA recognition involves the novel N-terminal helix, whose functional role remains unclear. By integrating these new results with the current knowledge about Pub1p, we proposed different mechanisms of Pub1p recruitment to the mRNPs and Pub1p-mediated mRNA stabilization in which the Pub1p/Tif4631p interaction would play an important role.
RNA binding proteins (RBP) play essential regulatory roles during different stages of mRNA metabolism. Poly(A)-binding protein 1 (Pab1p) ,  and poly(U)-binding protein 1 (Pub1p) ,  are among the most abundant mRNA binding proteins in Saccharomyces cerevisiae. These two proteins recognise RNA sequences with high affinity and specificity by using modular architectures that contain several RNA binding domains, a strategy commonly used by RBPs. Pab1p and Pub1p contain respectively, four and three RNA-recognition motifs (RRMs), a 80–150 amino acids domain very abundant in higher eukaryotes , , . RRMs can also function as protein-protein recognition domains, as has been already shown in several complex structures (reviewed in ). The second RRM of Pab1p interacts with Tif4631p and Tif4632p, the eIF4G1 and eIF4G2 homologues in yeast , , promoting the formation of a “closed-loop” structure for the mRNA, which stimulates translation initiation by facilitating mRNA recruitment to pre-initiation complexes , , . The “closed-loop” mRNP is recruited to the 43S pre-initiation complex (43S PIC) to form the 48S PIC and represents a crucial checkpoint in the regulation of translation initiation (recently reviewed in ). Proteins that bind to eIF4E or Pab1p can lead to destabilization of the “closed-loop” mRNP causing translation initiation arrest , .
Pub1p biological function antagonizes the Pab1p one. This protein is not associated with polysomes, but rather seems to form part of the translationally inactive mRNAs pool . Pub1p is homologous to mammalian HuR and TIA-1/TIAR proteins, which modulate mRNA stability by blocking different degradation pathways. Pub1p also causes changes in mRNA turnover, majorly stabilization , which is regulated by Pub1p recognition of RNA sequences in the 3′- and 5′-UTRs. For example, Pub1p recognises stabilizer elements (STE) in the 5′-UTR protecting GCN4 and YAP1 mRNAs against the NMD pathway , but can also bind ARE-sequences in the 3-UTR preventing deadenylation-meditated mRNAs degradation . Moreover, Pub1p and TIA-1 proteins play also an important role in translational repression , . Protein expression arrest is an universal mechanism of response to stress (recently reviewed in ) and is frequently regulated by cellular signalling pathways that lead to post-translational modification of proteins involved in translation initiation: eIF2α and 4E-BPs among them . Microscopically, stress-induced translational arrest causes the accumulation of most of the mRNAs into cytoplasmic RNA granular structures such as stress granules (SG) and processing bodies (PB) in both yeast and mammalian cells , , , , , . In general, cytoplasmic mRNPs are in a dynamic equilibrium between actively translated forms (i.e. polysomes) or arrested states (i.e. SG, PB and other RNA granules). Different stressors cause inhibition of translation initiation at different stages, altering this balance and causing the accumulation of different types of RNA stress granules. Formation of stress granules has been reported in Saccharomyces cerevisiae upon glucose starvation (also known as EGP-bodies for eIF4E,G and Pab1p) , , , , robust heat shock , arsenite , sodium azide  and high ethanol levels . Pub1p is required for the formation of the glucose deprivation RNA granules (i.e. EGP-bodies) , , . In addition to Pub1p, this type of RNPs contain Pab1p, Tif4631p, Tif4632p, Tif45p, Pbp1p, Ngr1p, Ygr250c, Gbp2p and Nrp1p proteins, but lack components of the 43S pre-initiation complex (including the 40S ribosomal subunit and eIF3), an important difference with the mammalian SG , . Heat shock stress granules have been recently reported in yeast  and are compositionally more similar to mammalian SG. However their mechanism of assemble is independent of eIF2α phosphorylation, in contrast to the mammalian ones. Like in the EGP-bodies the heat shock stress granules contain Pub1p. The influence of this protein in the nucleation of SG is likely due to the presence of a prion-like C-terminal domain, which in the case of TIA-1 has been probed to be necessary for in the SG formation , , . Nam8p and Ngr1p, the other S. cerevisiae TIA-1 homologues , , also contain this prion-like domain. Nam8p plays a role in mRNA splicing  that resembles that of TIA-1 , , , ,  while Ngr1p is the yeast homolog of TIAR .
Here we have studied the structure and protein-RNA and protein-protein recognition properties of different regions of Pub1p by NMR. The protein contains two RRMs at the N-terminus whose X-ray structure has been recently reported . We have assigned the NMR spectra of this region and determined that it binds prefentially poly(U) sequences with both RRMs participating in the RNA recognition. We have also determined the solution structure of the C-terminal RRM domain, which adopts an atypical fold with a novel N-terminal helix that is chiefly stabilized by the interactions with a key tryptophan residue. Remarkably, these unique features seem to be exclusive of the Pub1p/TIA-1 protein family. This single RRM displays low micromolar affinity for U-rich and UA-rich sequences with a slight preference for the latter ones and the recognition mediated by the β-sheet and the loops. Furthermore we have discovered that the Pub1p C-terminal domain recognises specifically the N-terminal region (residues 1–402) of Tif4631p. We have characterized this interaction by NMR spectroscopy, following changes in signals of both Pub1p and Tif4631p, and by in vitro protein-protein cross-linking experiments with glutaraldehyde. NMR chemical shift mapping experiments identify a novel protein-protein recognition surface, which involves residues conserved in Pub1p/TIA-1 proteins. Comparative cross-linking experiments on a Pub1p mutant confirm the interaction and validate the NMR-identified interface. Additional NMR experiments that monitor Tif4631p (1–402) signals allowed us to assign the Pub1p binding site to a conserved peptide box (Box1) that contains the single Trp residue (Trp 95) of the molecule. The new structural information is integrated with previous biochemical data to propose possible models of Pub1p-mediated mRNA stabilization and translational control
Saccharomyces cerevisiae Pub1p is a 453 residue protein that contains three RRM domains. We cloned three fragments of this protein: Pub1p R13 containing RRM1, RRM2 and RRM3 domains (residues 32–414), Pub1p R12 including RRM1 and RRM2 domains (residues 32–242) and Pub1p R3 that contains RRM3 (residues 315–414). The constructs were expressed as thioredoxin A fusions (N-terminal) to enhance protein expression and solubility. Nevertheless, only Pub1p R12 and Pub1p R3 expressed as soluble proteins and remained so after tag removal. Pub1p R3 is less soluble (300 µM) than the Pup1p R12, and becomes even less soluble at pH below 7.0, maybe due to contributions of histidine side chains titrating in this pH range. Pub1p R13 construct expressed into inclusion bodies, perhaps due to the asparagine/methionine-rich region in the RRM2-RRM3 linker. Other aggregation-prone sequences (like the N- and C-terminal poly-glutamine regions) were not included in the designs. Remarkably, prion-like regions are required for mammalian stress granule assembly in TIA-1 , , the metazoan homolog of Pub1p.
Pub1p R12 and Pub1p R3 constructs are monomeric and well-folded according to the appearance of their 1H-15N HSQC spectra (Figure 1), proton amide T2 relaxation measurements, gel filtration chromatography and, in the case of Pub1p R3, sedimentation equilibrium ultracentrifugation experiments (data not shown). The chemical shift index CSI  provides evidences about the secondary structure contents of these proteins (Figure 1C). Crosspeaks corresponding to linker residues between RRM1 and RRM2 (149–157) are missing in the 1H-15N HQSC spectrum, evidencing that this segment might be conformationally flexible in the ms to µs time scale. Interestingly, the structure of a Pub1p construct similar to ours (residues 75–240) shows that this linker participates in the formation of a crystallographic swapped dimer . Our studies found no evidence about the presence of such dimer in solution, in agreement with the analysis of gel filtration data performed by the authors of crystal structure. Nevertheless, the line broadening effect might be explained with the hypothesis of a chemical equilibrium between a major monomeric and a minor dimeric forms occurring in an intermediate time scale (ms to µs). Finally, the first N-terminal 41 residues (32–72) of Pub1p R12 (not included in the X-ray construct) are conformationally disordered and highly mobile as derived from the chemical shifts and sharp line shapes.
For Pub1p R3, the most remarkable feature extracted from the 13C chemical shift analysis is the prediction of a non-canonical helical region right before the first strand of RRM3. Hence we decided to pursue a full structural determination of this protein domain.
Pub1p R3 NMR structure (PDB: 2LA4) was calculated from 1968 NOE-derived distance restraints and 129 angular restraints obtained from the analysis of selected 13C chemical shifts (Table 1). Excluding the first ten residues of the construct and the last two, which can be regarded as disordered, the structure (Figure 2) is of high quality with very low RMSD values across the 20-confomers-ensemble (0.38 Å for backbone atoms and 0.85 Å for heavy atoms). Within the folded region, 83% of the side chains of residues different from Ala, Gly and Pro are ordered (RMSD≤±30° for the dihedral angle χ1) and ten of them are also well-ordered at level of angle χ2.
Pub1p C-terminal domain adopts a typical α/β RRM-type fold with a β1α1β2β3α2 topology , , , , but with some key features in comparison with other RRM domains. First, the β-sheet is distorted in Pub1p R3, and the last strand β4 is missing. Instead there is a short β-hairpin (400–405) right after the helix α2 (386–396), previously reported in other RRM structures  and, more importantly, a non-canonical N-terminal α-helix (labelled here as α0: 326–333). This novel helix is stabilised by extensive contacts with helix α2 and the C-terminal residues (Figure 2C), chiefly Trp 410, which is at the centre of a hydrophobic cluster (Val 329, Ile 332, Ile 333, Ala 336, Pro 337, Val 340, His 386, Ala 390 and Ile 393) that defines the orientation of the new helix. The side chain imino proton of this residue forms a H-bond with the carbonyl group of Ile 332 that drives Trp 410 out of its normal position as part of the β-sheet, causing a local distortion of it and hindering the formation of the fourth strand. Consistent with its structural role, Trp 410 is strictly conserved across different fungal Pub1p proteins (Figure 2D), Ngr1p, Nam8p (two Pub1p homologues in yeast), TIA-1 and TIAR proteins. The sequence alignment along these proteins allowed us to identify a conserved motif (Trp-Gly-[Arg/Lys]) which is the hallmark of the family.
We pursued a thorough study of the role of Trp 410 by analysing the properties of a set of 11 mutants including replacements by aromatic (Tyr, Phe and His), hydrophobic (Leu, Ile and Pro), polar (Asn, Ser, Thr) and small (Ala and Gly) side chains. All mutants exhibited poor solubility when over-expressed at 37°C, a situation that is partially alleviated when expressed at 20°C (Figure S1A), however the correlation with the amino acid type is unclear. The poor solubility hindered the isolation and purification of these proteins and some of the soluble ones at low temperature turned to be highly unstable during the purification (i.e. Trp410Ile). We were only able to obtain enough quantities to perform further biophysical studies for the Trp410Phe mutant. The comparison of the melting curves of the wild-type and mutant proteins followed by circular dichroism shows that the later is about 10°C less stable (Figure S1B). The large decrease in ΔHm together with the more shallow profile of the curve is indicative of a less co-operative unfolding mechanism for the mutant, perhaps attributable to the absence of the N-terminal helix. The overall conclusion of this study is that the Trp 410 position is critical for protein solubility and stability. The buried hydrogen bond Trp 410 Nε1-Ile 332 CO seems to be the key interaction that enforces a tryptophan side chain at this position.
To summarize: the C-terminal (Trp-Gly-[Arg/Lys]) motif, the distorted three-stranded β-sheet and the new N-terminal helix represent novel features that are likely conserved in the RRM3 domains of Pub1p/TIA-1 proteins and represent the hallmarks of a new class of RRMs that we propose to name TRRM (TIA-1 C-terminal domain like RRM).
Pub1p binds poly(U) sequences in vitro  and we decided to investigate which regions of the protein are responsible for this activity and specificity. First we explored the RNA binding of the Pup1p R12 construct by NMR titration experiments (Figure 3). We tested three different probes: U14, A14 and (UA)7, which have very similar composition to that of the RNA motifs found in the UTRs of Pub1p-bound mRNAs . The U14 probe binds with high affinity in the slow-exchange regime, a behaviour typical of protein-RNA complexes with dissociation constants up to 250 nM . Therefore, our data demonstrate that the first two RRMs are sufficient to account for the polyuridine binding activity of Pub1p . In contrast, A14 probe titrates in the fast chemical shift exchange regime, typical of protein-RNA complexes with affinity higher than 15 µM . However the chemical shift perturbations are negligible (Figure 3) at equimolecular ratio (100 µM proteinRNA), indicating that the bound form is scarcely populated at the working concentration (i.e. KD>100 µM). Finally, the (UA)7 probe titrates in the intermediate chemical shift time scale which is characterized by signal disappearance (by exchange broadening) upon titration and even at stoichoimetric conditions. This behaviour is typical of protein-RNA complexes with KD ranging between 400 nM and 2 µM .
Mapping of chemical shift perturbations (CSP) onto the X-ray structure of Pub1p RMM1-RMM2 construct (residues 75–240)  reveals that both RRM participate in RNA recognition (Figure 3C). In this case, unlike other well-known examples , , , , it is not clear whether the linker region participates in the interaction, as the assignment of these residues remains elusive in the complex. The largest changes are concentrated on the canonical recognition interfaces (β-sheets), hence we might expect a recognition mode similar to other RRM-containing proteins that bind polyuridine tracks , , , . Strikingly, the CSP effect propagates beyond the boundaries defined for the X-ray construct as we could detect RNA-dependent changes up to the residue Ala 63 (Figure 3A). This evidences the importance of intrinsically unstructured regions like this (residues 63–74) in ligand binding. Remarkably, Pub1p fungal orthologues show a totally conserved tetrapeptide motif (GGRE in positions 69–71) in this region. Although the way it participates in protein-RNA recognition remains as an open question.
We also investigated RNA binding by Pub1p R3 through 1H-15N HSQC-based NMR titration experiments with several RNA probes (Table 2). Pub1p R3 forms complexes in the chemical shift fast exchange regime for uracyl-free RNA probes (i.e. A14, (AG)4, (AC)4 and C8) and in fast exchange with line broadening, or “insufficient fast exchange” , for uracyl-containing RNAs [U10, U12, U14, (UA)6, (UA)7 and (CU)6] (Figure 4). The first group of complexes display binding affinities around 70 µM or weaker (Table 2). KD values were calculated by a global fitting of the NMR signals with Δδav>0.1 ppm (Equation 1) with a non-linear least squares protocol. The same fitting procedure was followed for the second group of RNA probes but excluding the crosspeaks experiencing “insufficient fast exchange” for which equation 2 (see Materials and Methods) can not be applied . The U-rich (U14, U12, U10) and UA-rich series [(UA)7, (UA)6] display KD values around 1-order of magnitude tighter than (CU)6 and at least two-orders tighter than the A-rich sequences of the first group (Table 2). Furthermore we found that U14 and (UA)7 form 21 protein-RNA complexes whereas U12 and (UA)6 form 11 complexes. Thus, we can conclude that Pub1p R3 binds an RNA stretch no longer than 6 bases, close to the longest ssRNA bound by an isolated RRM domain .
The three-dimensional distribution of the RNA-induced CSP is similar for all the RNA probes (Figure 4B), with the exception of Asn 347 side chain (Figure 4A). The Nδ2-Hδ21 and Nδ2-Hδ22 peaks suffer variations between 0.2 and 0.4 ppm for U14, (UA)7 and (CU)6 complexes, but hardly any change for the A14 one. Residue Asn 347 is in the β1-α1 loop next to Phe 351, that also changes in the uracyl-containing RNA probes. Hence we propose that this loop contains an uracyl recognition hot spot in the protein, with Asn 347 side chain probably making specific contacts with the base moiety. A hydrogen bond between Asn 347 Hδ2× and the O4 of the uracyl would explain base specificity at this position, with Phe 351 making stacking interactions to the base. Interestingly, neither Asn 347 nor Phe 351 are conserved in TIA-1 proteins (Figure 2D), thus a different RNA selectivity might be expected for these proteins. The C-terminal tetrapeptide Gly 411-Lys 412-Glu 413-Arg 414 experiences very large CSP in the complexes and are surely involved in RNA binding (Figure 4B and 4C). The remaining changes are localized in the β-sheet, the usual RNA docking platform in most RRMs . Interestingly, neither Trp 410 nor α0 helix residues suffer significant perturbations, thus we can conclude that this novel element is not participating in RNA recognition.
Pub1p R3 binds (UA)7 probes with low micromolar affinity, which is the typical affinity range displayed by single RRM domains recognising single strand RNA fragments with some degree of specificity , , but still weaker than those displaying a high degree of specificity . The recently published structure of CUG-BP1 RRM3  is probably a close model of Pub1p R3 binding mode, as this protein recognises a related sequence (UG)3 with similar affinity (KD=1.3 µM) than Pub1p R3.
In conclusion, we found that Pub1p contains two separated RNA recognition platforms (RRM1-RRM2 and RRM3) which bind U-rich and UA-rich sequences with higher affinity than A-rich ones. The recognition of 3′-UTR by Pub1p might be commanded by the N-terminal region (which binds RNA U-stretches with greater affinity) with the C-terminal one playing an auxiliary role in the selection of the transcripts.
The TRRM domain has been exclusively found in Pub1p, TIA-1 and TIAR proteins, suggesting that it may play a unique role in the biological function of this family of proteins. Proteomic studies have identified Pub1p-binding proteins, which includes eIF4G yeast homologues Tif4631p, and Tif4632p , . The Tif4631p/Pub1p interaction has not been studied in detail yet, however it has been proposed to be involved in translation control of the MFA1 transcript in response to carbon source changes . Additionally, the Pub1p and Tif4631p are essential components of the EGP-bodies generated upon glucose starvation , robust heat shock stress granules  and their homologues (TIA-1/eIF4G) are constituents of mammalian stress granules , , , , . Tif4631p is a scaffold protein with a C-terminal part that contains eIF4A- and eIF4E-interacting domains (Figure 5A). Structures of these complexes have been reported , . Much less is known for the N-terminal part that is predicted to be intrinsically unstructured and includes the Pabp1 binding region . We overexpressed a Tif4631p construct corresponding to this region (residues 1–402), obtaining a soluble protein that, as previously shown , migrates at apparently higher molecular weight in the SDS-PAGE gels. Nevertheless, the protein identity was confirmed by mass spectroscopy and the recombinant protein shows the typical 1H-15N HSQC NMR spectrum of an intrinsically unstructured protein, sharp and very low dispersed signals (Figure 6). However, the existence of regions with some local secondary structure propensity cannot be entirely ruled out. First, we tested if Tif4631p (1–402) interacts with any of our Pub1p constructs by 1H-15N HQSC titration experiments, mixing unlabelled Tif4631p protein with 15N-labelled Pub1p R3 or Pub1p R12 in approximately 21 molar ratio. The comparison of the CSP for the two Pub1p constructs (Figure 5B) showed that Tif4631p (1–402) interacts specifically with Pub1p TRRM domain. The magnitude of the CSP is small but non-randomly distributed, suggesting that the interaction is weak but specific. Similar weak but specific effects are observed in the NMR titrations in the human eIF4A/4G/4H helicase complex . The Tif4631p binding site is formed by residues of strand β3, helix α1 and the linker between them (Figure 5C) and includes conserved Phe 364 and Phe 366 (in fungi). The side chain amide groups of Gln 362 and Asn 363 show significant CSP upon binding (Figure 5C), maybe because make specific interactions across the interface (i.e. hydrogen bonds). Among these four residues only Phe 364 is conserved in TIA-1, thus if an equivalent interaction with eIF4G exists it is likely to be topologically different.
Next, we analysed the perturbations on the Tif4631p NMR spectra by mixing unlabelled Pub1p R3 in a 21 molar excess with 15N-labelled Tif4631p (1–402) (Figure 6). This approach provides a complementary view of the interaction by showing the effect on Tif4631p. The high overlapping of Tif4631p 1H-15N HSQC spectrum masks most of the expected changes, although the lower-left crosspeak of the spectra shows changes upon binding (Figure 6B). Because of its unique position, this signal could luckily be assigned to Trp 95 Nε1-Hε1, the only Trp residue of Tif4631p (1–402). This evidences that Pub1p R3 either contacts this residue or at least is in its vicinity. Even more, the CPS changes are accompanied by differential changes in the signal line widths of the individual components of the Nε1-Hε1 multiplet. In the free state all the four signals showed similar intensity and line widths, a situation typical of a highly mobile group (i.e. intrinsically disordered protein) (Figure 6A). However, upon binding, all the components suffer line broadening, which is higher for the left components of the multiplet. This differential effect is attributable to the interference of NMR relaxation mechanisms (dipole-dipole and chemical shift anisotropy), popularly known as the TROSY effect . This change on the relaxation properties of the Trp signal undoubtedly shows that this side chain experiences a reduction in mobility attribuitable to the association with Pub1p R3. NMR data, CSP and the changes in line width, demonstrate that Pub1p R3 binding site overlaps with Tif4631p Box1 (Figure 5A), a highly conserved region in Saccharomyces eIF4G homologs  close to the Pab1p binding site.
To complement the NMR analysis, we decided to perform independent biochemical experiments. First, we chose affinity capture experiments by GST tagging either Tif4631p or Pub1p constructs, but we faced technical problems due to resin-induced protein precipitation for some Pub1p constructs, that interfere with the outcome of the experiment. As an alternative, we performed in vitro protein-protein cross-linking experiments using glutaraldehyde. This time, the large number of potentially reactive groups in Tif4631p (1–402) (48 Lys side chains) resulted in a large amount of unspecific reactions (self cross-linking mainly). We optimized the reaction by keeping protein concentrations below 10 µM and adjusting the Tif4631p/Pub1p ratio to the ratio of Lys on each protein (see Materials and Methods for further details). We also followed an adaptation of a vapour diffusion method that ensures a slow addition of the cross-linking agent  and minimizes unspecific reactions. The formation of the corresponding adducts was monitored by PAGE-SDS. Three different Tif4631p constructs were used: the original Tif4631p (1–402) and two shorter forms Tif4631p (1–184) and Tif4631p (1–82). On the other side, we used the Pub1p R12 and Pub1p R3 constructs and a double mutant of Pub1p R3 Phe364Ala/Phe366Ala that targets key residues of Tif4631p binding site identified by NMR. The results are summarized in Figure 7 and confirm the conclusions obtained from the NMR analysis. Tif4631p (1–402) and (1–184), that contain Box1, form adducts with Pub1p R3, but not with Pub1p R12 or with Pub1p R3 double mutant. The lost of binding in the mutant is highly revealing as demonstrates the direct involvement of Phe 364 and Phe 366 in the interaction. An alternative explanation arguing that mutations cause Pub1p R3 unfolding could be plausible, but the high resemblance between Pub1p R3 mutant and wild-type NMR spectra show that their structure must be similar (Figure S2). Unfortunately, Tif4631p (1–82) can not be stained with silver (Figure S3B, lines 3 and 6) and is only visible by negative staining on overexposed silver stain gels (data not shown). Nevertheless, none of the Pub1p variants form adducts with this construct, showing that the so-called RNA1 region in Tif4631p  (Figure 5A) does not interact with Pub1p.
In conclusion, our study about the molecular recognition properties of Pub1p shows a very versatile protein capable to recognise Tif4136p and RNA using structurally compatible interfaces (Figure 4C and and5C).5C). However, there are novel structural features discovered, such as the conserved N-terminal helix in Pub1p R3, whose function is still poorly understood, opening new avenues for future research.
ELAV-like HuR and TIA-1/TIAR proteins have been proposed to be the metazoan homologues of Pub1p on the basis of its similar functional role in mRNA metabolism. However, the structure of Pub1p R3 shows novel sequence features (the Trp-Gly-[Arg/Lys] motif), which are present in TIA-1 proteins but not in HuR ones, supporting that TIA-1/TIAR are the true evolutionary relatives of Pub1p. An unpublished structure of the third domain of TIAR protein has been deposited in the PDB data bank (ID 1X4G) and it also contains the novel N-terminal helix constituting, together with our structure, the only two cases of RRMs having this non-canonical secondary structure element. We propose to name this new class of domains as TRRM (TIA-1 C-terminal domain like RRM). The conservation of this structural feature might be correlated with a biological function common in TIA-1 and Pub1p proteins. In metazoans, TIA-1 is involved in stress granules formation , ,  and in alternative splicing , , , . In yeast, Pub1p and Ngr1p (TIAR homolog) are components of different cytoplasmic RNA granules (recently reviewed in ) while Nam8p is involved in RNA splicing , . Although the level of conservation of the Trp-Gly-[Arg/Lys] motif is paramount among the Pub1p/TIA-1 family, Ngr1p and Nam8p sequences lack the characteristic amphipatic profile in the region corresponding to the hypothetical N-terminal helix (Figure 2D), suggesting that this element is not present in these proteins. We have attempted to overexpress the TRRM domain of Ngr1p but it proved to be highly insoluble (data not shown), maybe because the absence of the N-terminal helix exposes the hydrophobic cluster beneath it. Similarly the destabilization of the same region in the Pub1p Trp410 mutants (Figure S1) could also decrease the solubility, perhaps due to a similar effect.
Pub1p is not the only RNA binding protein found in cytoplasmic RNA granules that contain non-canonical RRM fold. The second RRM of TDP-43, a component of some types of mammalian stress granules , , , has an additional β-strand in the β-sheet), which is involved in protein dimerization . The TDP-43 dimer interface includes helix α1 and self-recognition is guided by hydrogen bonds. It could be possible that Pub1p novel N-terminal helix functions similarly, however we have not found evidences by NMR, gel filtration and analytical ultracentrifugation of such behaviour. Moreover, the cross-linking experiments, which are very efficient in capturing transient interactions, did not detect the presence of a Pub1p TRRM dimer (compare lines 1 and 4 in Figure S3A).
A detailed structural analysis, in combination with the alignment of TIA-1/Pub1p TRRM sequences, identifies conserved residues in the opposite face to the RNA interface (Figure 8). The backside of Pub1p TRRM is shaped by the contacts between the three helices of the structure, which define a negatively charged interface with two grooves and three ridges (Figure 8). Two conserved residues (Asp 330 and Glu 387) characterize the shallow groove (not present in previous RRM structures) defined by helix α0 and helix α2; with the hydrophobic Pro 338 placed at the bottom of the imaginary valley. His 386 side chain, also in the region, is allowed to change to Arg or Lys, conserving a positive character that seems to be key for the π-cation interaction with Trp 410 and hence for protein stability (Figure 2C). The second groove, defined by helix α1 and helix α2, is wider and deeper and contains conserved hydrophobic side chains (Figure 8) and one or two negatively charged side chains on helix α1. An structurally equivalent region is involved in recognition of Trp-containing peptides in U2AF homology motifs (UHM), a well-known family of RRMs capable to interact with proteins and RNAs simultaneously  (Figure S4). In a more recent example, the same groove is used for protein FIR to recognise an α-helical segment of FUS through a novel protein-protein binding mode  (Figure S4). Like in the later case, Tif4631p interacts with Pub1p TRRM by using the α1/α2 groove, but also the linker connecting α1 and β3 and part of β3, suggesting a possible novel mechanism of RRM-mediated protein recognition. Our study identifies Box1 as the Pub1p interacting region in Tif4631p. This peptide motif contains a totally conserved tryptophan in Saccharomyces eIF4G homologs ( and Figure 5A), therefore the interaction might partially resemble that of the UHM. Interestingly, Pabp1 does interact with Tif4631p through its RRM2 domain , , using an interface similar to that of Pub1p TRRM. In this case, Pab1p targets the conserved peptide Box3 in Tif4631p, which neighbours the Pub1p binding site (Figure 5A). The structure of the Pab1p/Tif4631p complex has not been solved, however the recognition of mammalian eIF4G1 by the rotavirus NSP3 protein, a functional homolog of PABP1, provides solid evidences on the binding mode . Protein-protein recognition would involve two conserved residues in eIF4G1 (Ile 140 and Ile 142) that presumably pack against Phe 119 and Phe 122 of PABP1 RRM2. Equivalent residues can be found in Pab1p RRM2 and Tif4631p Box3 suggesting that the binding mode is conserved across species. We have experimentally proved that Pub1p Phe 364 and Phe 366 (structurally equivalent to PABP1 Phe 119 and 122) are involved in recognition (Figure 5C). Tif4631p Box1 does not contain conserved Ile residues, but rather has a series of aromatic side chains (Figure 5A) that would presumably participate in Pub1p binding. All these observations suggest that Pub1p and Pab1p share a similar Tif4631p recognition mechanism. Giving the high level of conservation of Box1 in Tif4632p, it is reasonable to think that Pub1p recognises it in a similar fashion.
According to chemical shift perturbation data, Pub1p TRRM might be able to recognise Tif4631p and RNA specifically and simultaneously. The Pub1p-binding box in Tif4631p is close to the Pab1p binding site, suggesting the possibility that both proteins could work cooperatively to recognise more than one sequence motif in the UTRs. Alternatively, it could be possible that Pub1p recruitment to the mRNPs was commanded by the interaction with Tif4631p, in an RNA-independent manner. We find interesting to discuss about the biological implications of these new structural findings and to see the degree of agreement with current models about different aspects of mRNA metabolism. Pub1p is loaded to the nuclear mRNPs and then shuttled to the cytoplasm, where is part of the fraction of translationally inactive mRNPs . The latter suggests that the protein is probably removed from the mRNPs at some stage of the translational initiation , or perhaps later during the pioneer round of translation , although this issue has not been yet experimentally addressed. Different types of stressors cause the interruption of the normal translation initiation flow path, causing the accumulation of pre-initiation complexes (PIC) that eventually aggregate into different types of cytoplasmic granules. It is known in yeast and mammals that Pub1p/TIA-1 proteins are components of some of these granules ( and references herein). In mammals, the most likely mechanism of TIA-1 mediated SG nucleation involves its C-terminal prion-like region , . Pub1p also contains an equivalent C-terminal region and additionally a Met/Asn-rich region between RRM2 and RRM3 that could play a similar role. The accumulation of stalled PICs at a stage that they still contain TIA-1/Pub1p, would trigger their aggregation into RNA granules (e.g. EGP-bodies or SG) above a certain concentration threshold.
Pub1p promotes stabilization of mRNAs against ARE-mediated deadenylation decay and NMD pathways, which occur in the P-bodies. Thus, it is likely that Pub1p binding prevents (or delays) the localization of a subset of mRNPS into P-bodies. Hence it seems important to unravel which are the determinants of Pub1p recruitment to these transcripts. A genome-wide study identified 368 cellular transcripts bound to Pub1p and, among them, 53 whose stability is decreased in the absence of Pub1p . Most of these mRNAs encode proteins that act in ribosomal biogenesis and cellular metabolism. Furthermore, after a re-examination of these data we have found that the Pub1p-bound dataset is enriched in transcripts derived from intron-containing genes (48 out of 368). Surprisingly the percentage increases up to 40% in the 53-subset (transcripts which are simultaneously bound and stabilized by Pub1p), suggesting that splicing modulates Pub1p-mediated stabilization, probably by promoting its recruitment to the nascent transcript. TIA-1 is a component of metazoan spliceosome , , , ,  but this biological function is performed (at least partially) by Nam8p in yeast , rather than by Pub1p. We propose three possible mechanisms of Pub1p recruitment to the nascent mRNP in a splicing-dependent manner: binding to known spliocesome components (i.e. Lms2p, Lms4p, Sto1p, Tif4631p) , replacement of Nam8p and specific recognition of RNA trans-acting elements in introns or exon junctions (typically U-rich tracks).
Apart from the splicing-dependent mechanism there should be an alternative Pub1p recruitment mechanism to account for transcripts derived from intronless genes. We agree that such procedure might depend on the cooperative recognition of RNA regulatory elements in the UTRs . A-rich, U-rich and UA-rich motifs are overrepresented in the 3′- and 5′-UTRs of Pub1p-bound mRNAs . Since Pub1p has low affinity for A-rich oligonucleotides, these authors proposed that additional RNA-binding proteins could act in coordination with Pub1p to achieve A-rich recognition. If Pub1p interacts with Tif4631p in vivo through a similar mode to that described here, transcripts containing A-rich and U-rich motifs could be recognised in a cooperative manner by Pub1p and Pab1p thanks to the simultaneous interaction of both proteins with neighbouring sites in Tif4631p. The model does not exclude the participation of other RBPs shuttle proteins like Hrp1p, Npl3p, Hrb1p or Gbp2p, some of which can be tethered to eIF4G, in a similar way than Pub1p and Pab1p does, to Tif4631p.
DNA fragments corresponding to wild-type constructs were amplified from Saccharomyces cerevisiae genomic DNA (Novagen) by PCR using high fidelity DNA polymerase KOD (Novagen) and specific DNA primers (Sigma, IDT). The products were purified, digested with the corresponding restriction enzymes and ligated into a home-modified pET28 (Novagen) vector that contains: an N-terminal thioredoxin A fusion tag, for enhanced expression and solubility, an internal 6×His tag for affinity purification and a TEV protease site, for fusion tag removal. Three constructs of S. cerevisiae Pub1p protein were obtained: Pub1p R13 that spans over three RRM domains (residues 32–414), Pub1p R12 containing domains RRM1 and RRM2 (residues 32–242) and Pub1p R3 that comprises domain RRM3 (315–414). The three Saccharomyces cerevisiae Tif4631p constructs (residues 1–402, 1–184 and 1–82) were cloned following equivalent methods.
The Trp410 library of mutants was generated in the Pub1p R3 framework by using the QuickChange site-directed mutagenesis® kit (Stratagene) and a DNA primer with a random base composition of codon 410 (5′-G GTG CTC GAG TTA TCT TTC CTT ACC RNN ACC GGT TCT CAA GTT TC-3′; R=A or G and N=A,G,C or T). Pub1p R3 Phe364Ala/Phe366Ala double mutant was generated using an equivalent protocol.
Plasmids corresponding to wild-type and mutant proteins were transformed in E. coli BL21(DE3) (Novagen) chemically competent cells and expressed in kanamycin (Sigma-Aldrich) containing (30 µg/l) LB medium. For isotopic labelling, a K-MOPS derived minimal medium  was supplemented with 15NH4Cl (1 g/l) and/or 13C-glucose (4 g/l). Bacterial cultures were typically grown at 37°C until OD600=0.6, transferred to a shaker at 20°C and induced overnight with 0.5 mM IPTG (Sigma-Aldrich) upon temperature stabilization. Cells were harvested by centrifugation (30 minutes at 3000 g), resuspended in buffer (20 mM potassium phosphate pH 8.0, 300 mM NaCl, 0.1% β-mercaptoethanol) containing protease inhibitors (Roche) and lysed by French-press (Thermo Scientific) or sonication. Lysates were pelleted at high speed (30 minutes at 15000 g) to separate cell debris and other insoluble components. Wild-type protein constructs (Pub1p R12, Pub1p R3 and Tif4631p) and Pub1p mutants were first purified by metal affinity chromatography (HisTrap™ HP 5 ml (GE Healthcare)). Fractions coming from this column that contain the fusion protein were pooled together, dialysed (overnight) at 4°C against 20 mM potassium phosphate pH 8.0 buffer (plus 0.1% β-mercaptoethanol) and simultaneously digested with homemade TEV protease (100–200 µg/ml). After complete digestion, the samples were re-applied to the nickel column and the target proteins were collected in the flow through. In the final step, proteins were purified and concentrated by ion-exchange chromatography: Pub1p R12 and Tif4631p with an anion exchange column (MonoQ 5 ml, GE Healthcare) and Pub1p R3 with a cation exchange one (SP 5 ml, GE Healthcare). In either case, proteins were eluted with a linear salt gradient (from 0 to 1 M NaCl) in a buffer containing 20 mM Tris-HCl pH 8.0 and 1 mM DTT. Pub1p mutant proteins were purified with similar methodologies than wild-type proteins. Sample purity and homogeneity was checked by PAGE-SDS and mass spectrometry.
Protein samples for NMR spectroscopy were approximately 500–800 µM for Pub1p R12 and 300 µM for Pub1p R3 in 90% H2O, 10% D2O, 0.1% sodium azide containing 25 mM potassium phosphate buffer (pH 6.5–7.0), 25 mM NaCl, and 0.5 mM DTT. All experiments were recorded in Bruker AV600 and AV800 NMR spectrometers equipped with cryoprobes. Sodium 2,2-dimethyl-2-silapentane-5-sulphonate (DSS) was used as an internal chemical shift reference. The assignment of the backbone 1H, 15N and 13C atoms was achieved by following the standard methodology. The 3D experiments HNCO, HNCA, HN(CO)CA, CBCA(CO)HN and CBCANH , recorded in the AV800 spectrometer, were manually assigned for Pub1p R12. An equivalent set of experiments, plus HN(CA)CO , was acquired for the Pub1p R3 construct, and the spectra were assigned by following a supervised automatic assignment protocol . 3D HC(C)H-TOCSY and (H)CCH-TOCSY experiments  were recorded to assign the side chain resonances of Pub1p R3 construct. NMR spectra processing was done with NMRPipe  whereas spectral viewing and analysis were done with the program CcpNmr Analysis . The completeness of the assignment for the Pub1p R3 construct extends to 95%, while only backbone 13C/15N resonances were assigned for the Pub1p R12 construct.
For Pub1p R3, a total of 1968 NOE-derived distance constraints were extracted from the 2D-NOESY  and 15N-edited-3D-NOESY  spectra, both with mixing times of 80 ms. Additional 129 dihedral angular constraints were extracted from the analysis of 13C (Cα ,C′ and Cβ) chemical shifts  and incorporated to the structure calculation. The structure calculation was performed by following an standard torsion angle dynamics protocol implemented in the program CYANA 2.1 . Coordinates of 50 conformers were randomly generated and then subjected to the restrained simulated annealing protocol. The 20 lowest target function conformers with no distance violations larger than 0.2 Å and no angle violations larger than 5.0° were selected as representative of the structure of the protein (PDB code: 2LA4 and BMRB code: 17502). MOLMOL  and PyMOL v0.98 (DeLano Scientific LLC, Palo Alto, CA, USA) molecular graphics packages were used for the analysis and representation of the structures.
All RNA probes were chemically synthesized and purchased from IDT (Integrated DNA technologies). All samples for NMR titration studies were prepared by simultaneously dialyzing against the same protein buffer (see above for other NMR experiments). Samples were prepared without D2O in a 4 mm NMR tube (typically 300 µl) and then inserting this into a 5 mm tube that contains D2O for referencing the lock signal. In the case of Pub1p R12, 1H-15N HSQC experiments were recorded at 11 proteinRNA ratio (100 µM concentration each) for the probes: U14, A14 and (UA)7. Assignment of the spectrum of the A14 complex was made by simple comparison with the free protein spectrum, whereas the Pub1p-U14 complex spectrum was assigned with the aid of triple resonance experiments (3D HNCA and CBCA(CO)HN) acquired with a concentrated 11 sample (500 µM). Pub1p R3 RNA binding affinity and selectivity was investigated with eleven RNA probes: U14, U12, U10, A14, (UA)7, (UA)6, (CU)6, (AG)4, (AC)4, (GU)4 and C8. Experimental series were performed starting at the highest RNAprotein ratio (typically 400 µM100 µM) and the following points prepared by dilution of this sample with a stock of 15N labelled protein at the same concentration and buffer composition, thus the protein concentration is kept fixed. Six to eight titration points were typically recorded for each RNA probe at proteinRNA ratios ranging between 41 and 0.031. The weighted averaged 1H and 15N amide chemical shift changes were calculated according to the following formula:
The Δδav values measured in experiments at constant protein concentration (P0) were plotted versus the total ligand concentration values (L0) and fitted to the following equation to obtain the maximum chemical shift change (Δδmax) and the dissociation constant (KD) :
Because this equation applies only in the fast exchange limit , 1H-15N HSQC peaks experiencing line broadening during the titration were excluded from the KD calculation. The remaining peaks showing Δδav>0.1 ppm between the free and the RNA-bound protein at the highest RNA/protein ratio were fit globally (KD global parameter; Δδmax local parameter for each curve) to equation (2) by a non-linear least squares method with the software package GraphPad Prism version 5.00 for Mac (GraphPad Software, San Diego California USA, www.graphpad.com).
Samples for the protein constructs Pub1p R12, Pub1p R3 and Tif4631p (1–402) were dialyzed simultaneously against 25 mM potassium phosphate buffer (pH 7.0, 25 mM NaCl, 0.1 mM DTT) for 24 hours at 4°C twice, to guarantee no differences in buffer composition. Binding was monitored on the 1H-15N HSQC spectra of Pub1p R12 and Pub1p R3 upon titration with different amounts of unlabeled Tif4631p (1–402). Pub1p constructs were prepared at 100 µM concentration and Tif4631p (1–402) was added at different ratios: 12 for the Pub1p R12-Tif4631p (1–402) complex and 11.9, 10.8 and 10.4 for the Pub1p R3-Tif4631p (1–402) one. The chemical shift perturbations were calculated as weighted average of 1H and 15N amide chemical shift changes (equation (1)). A complementary NMR titration experiment was done by monitoring changes in the 1H-15N HSQC spectrum of 100 µM 15N-labelled Tif4631p (1–402) upon addition of 120 µM unlabelled Pub1p R3. All the NMR experiments were recorded at 25°C in a Bruker AV800 spectrometer with a cryoprobe unit.
Pub1p-Tif4631p interaction was probed by glutaraldehyde cross-linking using a variation of the vapor diffusion method . Sample volumes of 200 µL were prepared by mixing different amounts of Pub1p and Tif4631p constructs in 25 mM potassium phosphate, 150 mM NaCl and 0.1 mM DTT buffer (pH 8.0) on ice. Pub1p proteins (Pub1p R3, Pub1p R12 and Pub1p R3 Phe364Ala/Phe366Ala mutant) were kept at 10 µM final concentration in all the experiments, Tif4631p (1–402) at 1 µM and Tif4631p (1–184) at 4 µM. The concentration of the Tif4631p proteins was experimentally optimized in an attempt to compensate the larger number of potentially reactive side chains in these proteins (41 and 13 Lys respectively) in comparison with Pub1p R3 (4 Lys) and Pub1p R12 (12 Lys) constructs. The different protein mixtures were loaded in the buffer reservoirs of a 24-well sitting-drop plate (Hampton Reserch), the upper reservoir was filled with 20 µL of 0.5 M aqueous glutaraldehyde solution, the plate sealed with film and incubated 16 h at 4°C allowing a smooth diffusion of the cross-linker. Upon reaction completion, the glutaraldehyde drops were carefully removed and 10 µL of 2 M Tris-HCl (pH 8.0) were added to the reaction mixture to quench the traces of reactant in the protein mixture. Sample (10 µl) components were separated by PAGE-SDS electrophoresis (15% gel) and revealed by silver stain methods. Reaction mixtures were further characterized by MALDI-TOF MS.
All CD experiments were recorded in a Jasco J-810 spectropolarimeter equipped with a Peltier temperature control unit. Thermal denaturation experiments were carried out in a 25 mM saline potassium phosphate buffer (pH 7.5–8.0) at protein concentrations of approximately 40 µM. Cell path lengths of 0.1 cm (Pub1p R3 wild-type) and 0.5 cm (Pub1p R3 Trp410Phe) were used. Ellipticity was recorded at 220 nm at a temperature range of 20–95°C for wild-type and 20–80°C for mutant protein and at heating rates of 60°C/hour and 30°C/hour respectively.
Protein solubility and stability of Trp410 mutants. (A). Summary of the protein expression test performed on various Trp410 single mutants in E. coli BL21 (DE3) strains. All proteins were expressed as thioredoxin fusions and gels show the levels of total and soluble fractions obtained after induction of protein expression at two temperatures. (B) Melting curves of wild-type (left) and W410F mutant (right) proteins monitored by circular dichroism (arbitrary ellipticity units).
Pub1p R3 Phe364Ala/Phe366Ala mutant maintains the 3D structure of the wild-type protein. The comparison of the aromatic region of the 2D TOCSY spectra of the two proteins shows most of the Tyr and Phe side chains at very similar positions, indicative of their structurally similar environments. The residue Phe 361, spatially close to the mutations, suffers the largest perturbation in this region.
Control cross-linking reactions of individual protein constructs used in this study. (A) Silver-stained 15% PAGE gel comparing glutaraldehyde treated and untreated Pub1p constructs. Conditions are the same as in Figure 7 and in Materials and Methods. (B) Same comparison for the three Tif4631p constructs. The approximate position of the Tif4631p (1–82) band was indicated with an asterisk. This constructs was not stained with silver and is only visible (by negative staining) in heavily stained gels (data not shown).
Several examples of RRM mediated protein-protein recognition modes. (A) Tif4631p recognition interface of Pub1p (shaded in green), (B) complex between FBP (green) and FBP-interacting repressor (FIR) (in grey) and (C) complex between SPF45 UHM domain (grey) and a SF3b-155 peptide (green).
We would like to acknowledge to other members of the Biological Physical Chemistry Department, in particular to Dr. María Gasset for her fruitful comments and Mss. Diana Velazquez for technical assistance during this research.
Competing Interests: The authors have declared that no competing interests exist.
Funding: This work was supported by C.S.I.C. (http://www.csic.es/) Special Intramural project [200880I088], MICINN (http://www.micinn.es/) project [CTQ2008-00080/BQU] and “Comunidad Autónoma de Madrid” (http://www.madrimasd.org/) project [CCG07-CSIC GEN2333]. SML is holder of an undergraduate contract from the “Consejeria de Educacion de la Comunidad Autónoma de Madrid” (http://www.madrimasd.org/) and the European Social Fund (http://ec.europa.eu/employment_social/esf/index_en.htm). YM is holder of a Predoctoral “Junta de Ampliación de Estudios (JAE)” contract (http://www.csic.es/web/guest/programa-jae). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.