Tpa1 (for termination and polyadenylation) from Saccharomyces cerevisiae is a component of a messenger ribonucleoprotein (mRNP) complex at the 3′ untranslated region of mRNAs. It comprises an N-terminal Fe(II)- and 2-oxoglutarate (2OG) dependent dioxygenase domain and a C-terminal domain. The N-terminal dioxygenase domain of a homologous Ofd1 protein from Schizosaccharomyces pombe was proposed to serve as an oxygen sensor that regulates the activity of the C-terminal degradation domain. Members of the Tpa1 family are also present in higher eukaryotes including humans. Here we report the crystal structure of S. cerevisiae Tpa1 as a representative member of the Tpa1 family. Structures have been determined as a binary complex with Fe(III) and as a ternary complex with Fe(III) and 2OG. The structures reveal that both domains of Tpa1 have the double-stranded β-helix fold and are similar to prolyl 4-hydroxylases. However, the binding of Fe(III) and 2OG is observed in the N-terminal domain only. We also show that Tpa1 binds to poly(rA), suggesting its direct interaction with mRNA in the mRNP complex. The structural and functional data reported in this study support a role of the Tpa1 family as a hydroxylase in the mRNP complex and as an oxygen sensor.
The central processes of gene expression, from mRNA synthesis to translation and degradation, are mediated by the messenger ribonucleoprotein (mRNP) complexes. These mRNP complexes consist of individual transcripts bound by a changing repertoire of proteins that are the actual target for regulation and dictate the fate of an mRNA in the gene expression pathway (1). The ‘mRNP code’ is unique for each transcript in a given cell type and changes as the primary nuclear transcript undergoes processing and export (2). The mRNA-binding proteins mediate co-transcriptional 5′-end capping, splicing, 3′ polyadenylation and quality control of mRNAs within nascent mRNP complexes (3–5). While passing through each stage of mRNA processing, an mRNP complex is successively remodeled through the loss or gain of RNPs and other proteins, and is implicated in mRNA localization, translation initiation/termination and mRNA turnover in the cytoplasm (2).
Eukaryotic mRNAs are synthesized with two integral stability determinants: a cap at their 5′-end and a poly(A) tail at their 3′-end, that are believed to provide a basal level of stability to all mRNAs by preventing their degradation by cytoplasmic exonucleases (6). The mRNA-stabilizing properties of the poly(A) tail have been linked both to its binding to the poly(A)-binding protein (PABP) (7) and to its capacity to stimulate active translation (8). The length of the poly(A) tail is regulated by polymerase-deadenylase (9) and plays an essential role in controlling the mRNA function, subcellular localization and half-life. The removal of the mRNA poly(A) tail (deadenylation process) is known to be the first step in normal mRNA decay and is the major factor that controls the rate of mRNA decay (10,11). The processes of mRNA deadenylation and degradation, and the protein complexes that are involved in these processes are evolutionarily well conserved from Saccharomyces cerevisiae to humans (12).
Tpa1 (termination and polyadenylation) of S. cerevisiae was shown to be part of an mRNP complex at the 3′ untranslated region of mRNAs (13). It associates specifically with components of the translation termination complex, and is involved in both translation termination and in the regulation of normal mRNA decay through translation termination-coupled poly(A) shortening (13). Genetic studies demonstrated that the tpa1Δ mutation led to a 1.5- to 2-fold increase in the half-lives of mRNAs degraded by the general 5′→3′ pathway or the 3′→5′ nonstop decay pathway (13). S. cerevisiae Tpa1 was found to interact with a variety of proteins involved in mRNA processing, the translation termination factors eRF1 and eRF3, and a tRNA methyltransferase TRM1 (13,14).
It was also shown that Tpa1 interacts with the PABP (13). PABP, part of the 3′-end RNA-processing complex, mediates interactions between the 5′ cap structure and the 3′ mRNA poly(A) tail, is involved in the control of poly(A) tail length, and interacts with the translation factor eIF-4G (15). PABP binds poly(A) using one or more RNA-recognition motifs (16). Nuclear PABPs are necessary for the synthesis of the poly(A) tail, regulating its ultimate length and stimulating maturation of mRNAs (17). In the cytoplasm, PABPs facilitate the formation of the ‘closed loop’ structure of the mRNP particles, which is crucial for additional PABP activities that promote translation initiation/termination, recycling of ribosomes and stability of mRNAs (17).
A comprehensive bioinformatics study revealed that S. cerevisiae Tpa1 contains an N-terminal domain (NTD) belonging to the non-heme Fe(II)- and 2-oxoglutarate (2OG) dependent dioxygenase family (Pfam03171), and a C-terminal domain (CTD) belonging to the ofd1_CTDD (2OG and iron-dependent oxygenase C-terminal degradation domain) family (Pfam10637) (18,19). Members of the Fe(II)-/2OG-dependent dioxygenase family catalyze a remarkable diversity of enzymatic reactions (20), including post-translational modification of protein side chains (as hydroxylases) and the repair of alkylated DNA/RNA (as demethylases). Some members of this family play a role as cellular sensors and regulators (21). Recently, the N-terminal dioxygenase domain of Schizosaccharomyces pombe Ofd1 protein, a homolog of S. cerevisiae Tpa1 showing sequence identity of 40% (55% similarity), was shown to be required for oxygen sensing (19,22). Members of the Fe(II)-/2OG-dependent dioxygenase family are characterized by a core consisting of eight antiparallel β-strands that adopt the double-stranded β-helix (DSBH) fold, also known as the β-strand jelly-roll fold or double Greek key motif (23,24). They bind Fe(II) through the conserved H1X(D/E)…H2 motif (20,25,26). For many members of this family, the oxidation of the primary protein substrate is coupled to the conversion of the co-substrate 2OG to succinate and CO2 (20).
Homologs of S. cerevisiae Tpa1 are present not only in other fungi but also in higher eukaryotes. Human Tpa1 shows 28% sequence identity (43% similarity) to S. cerevisiae Tpa1 (Figure 1). Despite the importance of the biological roles that have been suggested for the Tpa1 family members in yeast, further studies are required to clearly establish their molecular functions. To provide a structural basis for a better understanding of the function, we have determined the crystal structure of S. cerevisiae Tpa1 as a representative member of the Tpa1 protein family. The structure reveals that two domains of Tpa1 have the common DSBH fold. However, a dioxygenase-type active site for binding Fe(II) and 2OG is present only in the NTD of S. cerevisiae Tpa1, whereas the CTD lacks such a feature. Furthermore, our electrophoretic mobility shift assay demonstrates that Tpa1 is capable of binding poly(A), suggesting that S. cerevisiae Tpa1 directly interacts with the mRNA in the mRNP complex.
The S. cerevisiae tpa1 gene (YER049W) encoding the N-terminal truncated form (residues 21–644) of Tpa1, which is called Tpa1Δ20, was PCR-amplified and cloned into the expression vector pET-21a(+) (Novagen). The recombinant protein fused with a hexahistidine-containing tag at its C-terminus was overexpressed in Escherichia coli Rosetta2(DE3)pLysS cells using Terrific Broth culture medium. Protein expression was induced by 0.5 mM isopropyl β-d-thiogalactopyranoside and the cells were incubated for an additional 18 h at 30°C following growth to mid-log phase at 37°C. The cells were lysed by sonication in buffer A (50 mM Tris–HCl at pH 7.9, 0.50 M NaCl and 10% (v/v) glycerol) containing 5.0 mM imidazole. The crude lysate was centrifuged at ~36 000g for 60 min. The supernatant was applied to an affinity chromatography column of HiTrap Chelating HP (GE Healthcare), which was previously equilibrated with buffer A. Upon elution with a gradient of imidazole in the same buffer, the protein eluted at 120–150 mM imidazole. The eluted Tpa1 was applied to a HiLoad XK-16 Superdex 200 prep-grade column (Amersham-Pharmacia), which was previously equilibrated with 20 mM Tris–HCl (pH 7.5) and 200 mM NaCl. Fractions containing the yeast Tpa1 were concentrated to 5 mg ml–1 for crystallization using an Amicon Ultra-15 centrifugal filter unit (Millipore). The procedure for preparing the SeMet-substituted protein was the same except for the presence of 1.0 mM Tris(2-carboxyethyl)phosphine hydrochloride (TCEP) in all buffers used during purification steps besides buffer A. When overexpressing the SeMet-substituted protein in E. coli Rosetta2(DE3)pLysS cells, we used the M9 cell culture medium that contained extra amino acids including SeMet. The CTD (residues 273–644) fused with a hexahistidine-containing tag at its C-terminus was overexpressed and purified essentially as described earlier.
We tried to overexpress several constructs of the NTD (residues 21–246, 21–250, 21–253 and 21–277) but all of them were expressed in insoluble form.
Crystals were grown by the sitting-drop vapor diffusion method at 14°C by mixing equal volumes (2 μl each) of the protein solution (5 mg ml–1 concentration in 20 mM Tris–HCl, pH 7.5 and 200 mM NaCl) and the reservoir solution. Crystals of the native Tpa1–Fe(III) binary complex were grown using a reservoir solution consisting of 200 mM lithium sulfate, 100 mM Tris–HCl (pH 8.5) and 25% (w/v) PEG 3350 and 0.14 mM of ferrous ascorbate. They grew to approximate dimensions of 0.2 mm × 0.2 mm × 0.1 mm within a few days. Crystals of the binary complex of the SeMet-substituted Tpa1 were obtained as above for the native crystals except for the presence of 0.10 mM TCEP and 10 mM strontium chloride in the protein solution. To elucidate how Tpa1 binds its co-substrate 2OG, crystals of the native Tpa1–Fe(III)–2OG ternary complex were obtained by co-crystallization in the presence of 0.14 mM ferrous ascorbate and 7.0 mM sodium 2OG.
A crystal of the native Tpa1–Fe(III) binary complex was dipped for a few seconds into 10 µl of a cryoprotectant solution, which consisted of 20% (v/v) glycerol added to the reservoir solution. We found that the addition of ~2 µl of 50 mM ferrous ascorbate solution to the cryoprotectant solution was necessary to improve the resolution limit of the crystals. The soaked crystal was frozen in the cold nitrogen gas stream at 100 K. X-ray diffraction data were collected at 100 K using an ADSC Quantum 210 CCD detector system (Area Detector Systems Corporation, Poway, CA, USA) at the beamline NW12A of Photon Factory (PF), Japan. The raw data were processed and scaled using the program suite HKL2000 (27). The native Tpa1–Fe(III) binary complex crystal belongs to the space group P321, with unit cell parameters of a = b = 136.3 Å, c = 79.92 Å, α = β = 90° and γ = 120°. One Tpa1 monomer is present in an asymmetric unit, giving a solvent fraction of 58.2%. X-ray diffraction data from a crystal of the native Tpa1–Fe(III)–2OG ternary complex were collected essentially as described above at the BL-4A experimental station of Pohang Light Source (PLS), Pohang, Korea. For the ternary complex crystals, the addition of 50 mM ferrous ascorbate solution to the cryoprotectant solution was not needed to improve the resolution limit. The native Tpa1–Fe(III)–2OG ternary complex crystal belongs to the space group P321, with unit cell parameters of a = b = 136.3 Å, c = 79.83 Å, α = β = 90° and γ = 120°. One Tpa1 monomer is present in an asymmetric unit, giving a solvent fraction of 58.2%.
Single-wavelength anomalous diffraction (SAD) data using a binary complex crystal of the SeMet-substituted Tpa1 were collected at 100 K on a Quantum 210 CCD area detector (Area Detector Systems Corporation) at the BL-4A experimental station of PLS, Pohang, Korea, at the absorption peak for selenium. The addition of 50 mM ferrous ascorbate solution to the cryoprotectant solution was not helpful in improving the resolution limit of the binary complex crystals of the SeMet-labeled Tpa1. Instead, the crystal was dehydrated by incubating for ~10 min in 10–20 µl of a cryoprotectant solution containing 40% (v/v) glycerol. The SeMet-substituted crystal belongs to the space group P321, with unit cell parameters of a = b = 136.3 Å, c = 83.28 Å, α = β = 90° and γ = 120°. One Tpa1 monomer is present in an asymmetric unit, giving a solvent content of 59.9%. A summary of the data collection statistics is given in Table 1. Selenium atoms were located with the program SOLVE (28). The phases were further improved by density modification using the program RESOLVE (29), yielding an interpretable electron density map. Phasing statistics are summarized in Table 1.
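The solvent fractions quoted above follow from the unit cell volume and the protein mass in the asymmetric unit. The sketch below is a minimal illustration, not the authors' calculation. It assumes the standard Matthews relation (solvent fraction ≈ 1 − 1.23/VM), six asymmetric units per unit cell for space group P321, and a molecular mass of about 72.9 kDa for the tagged Tpa1Δ20. That mass is not stated in the text and is used here only as an assumption; the function names are hypothetical.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume in cubic angstroms; cell angles are given in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


def matthews(volume, n_asu, mass_da, n_mol_per_asu=1):
    """Return (Matthews coefficient in A^3/Da, estimated solvent fraction)."""
    vm = volume / (n_asu * n_mol_per_asu * mass_da)
    solvent = 1.0 - 1.23 / vm  # standard approximation for protein crystals
    return vm, solvent


if __name__ == "__main__":
    ASSUMED_MASS_DA = 72900.0  # assumption: not given in the text
    N_ASU_P321 = 6             # general positions in space group P321
    # c-axis lengths reported for the three crystals
    for label, c in [("native binary", 79.92),
                     ("native ternary", 79.83),
                     ("SeMet binary", 83.28)]:
        v = cell_volume(136.3, 136.3, c, 90.0, 90.0, 120.0)
        vm, solvent = matthews(v, N_ASU_P321, ASSUMED_MASS_DA)
        print(f"{label}: V = {v:.0f} A^3, VM = {vm:.2f} A^3/Da, "
              f"solvent = {100 * solvent:.1f}%")
```

Because the mass enters only through VM, a different assumed mass shifts all three solvent estimates in the same direction. The longer c axis of the SeMet crystal is what gives it the higher solvent content.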
The SAD-phased electron density map was interpreted by the automatic model building program RESOLVE to build an initial polyalanine model, which accounted for ~30% of the residues. Subsequently, the missing residues were built and the chains were traced manually using the program COOT (30). The model of the SeMet-labeled Tpa1 was refined with the program REFMAC (31), including the bulk solvent correction. The refined model of the SeMet-labeled, Fe(III)-bound Tpa1, accounting for 563 residues in one monomer of Tpa1, 199 water molecules and a Fe(III) ion in the asymmetric unit, gave Rwork and Rfree values of 19.0 and 24.7% for the 20.0–2.73 Å data, respectively (Table 1). A total of 5% of the data were randomly set aside as the test data for the calculation of Rfree (32). The model of the SeMet-labeled Tpa1 was used to refine the structures of native Tpa1. The refined model of the native Fe(III)-bound Tpa1, accounting for 557 residues in one monomer of Tpa1, 324 water molecules, a Fe(III) ion and four sulfate ions in the asymmetric unit, gave Rwork and Rfree values of 18.1 and 23.1% for the 20.0–2.50 Å data, respectively (Table 1). One of the sulfate ions is located at the crystallographic 2-fold symmetry axis. The model of the ternary complex was refined against 20.0–1.77 Å data to Rwork and Rfree values of 18.0 and 20.5%, respectively. It accounts for 558 residues in one monomer of Tpa1, 791 water molecules, a Fe(III) ion, one molecule of 2OG, four sulfate ions and one glycerol molecule in the asymmetric unit. The electron density map indicated that the side chains of 11 residues (His59, Ile87, Val90, Ser169, Ser183, Asn266, Ser339, Gln340, Ser343, Glu345, Cys494 and Asp592) have dual conformations. The refined models of both binary and ternary complexes have excellent stereochemistry (Table 1), as evaluated by the program PROCHECK (33).
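For readers unfamiliar with the Rwork and Rfree statistics reported above, the following sketch shows the conventional residual, R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|, and a random 5% test-set split. It is a simplified illustration under stated assumptions, not the REFMAC procedure: the scale factor between observed and calculated amplitudes is omitted, the amplitudes in the usage example are made up, and the function names are hypothetical.

```python
import random


def r_factor(f_obs, f_calc):
    """Crystallographic residual between observed and calculated amplitudes."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def split_free_set(n_reflections, fraction=0.05, seed=0):
    """Randomly flag a fraction of reflection indices as the test (free) set."""
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = max(1, round(n_reflections * fraction))
    free = sorted(indices[:n_free])
    work = sorted(indices[n_free:])
    return work, free


if __name__ == "__main__":
    # Made-up amplitudes, for illustration only
    f_obs = [120.0, 85.0, 240.0, 60.0, 150.0]
    f_calc = [110.0, 90.0, 225.0, 66.0, 158.0]
    print(f"R = {r_factor(f_obs, f_calc):.3f}")
    work, free = split_free_set(1000)
    print(len(work), "working reflections;", len(free), "test reflections")
```

Rwork is this residual evaluated over the working reflections and Rfree over the test reflections, which are excluded from refinement so that Rfree can act as a cross-validation check.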
Equilibrium sedimentation studies were performed using a Beckman ProteomeLab XL-A analytical ultracentrifuge in 20 mM Tris–HCl buffer (pH 7.5) containing 4.0 μM ferrous ascorbate, 4.0 μM sodium 2OG and 150 mM NaCl at 20°C. Tpa1Δ20 samples containing ferrous ascorbate and 2OG were measured at 235, 237 and 280 nm using a six-sector cell at three speeds (8000, 10 000 and 12 000 r.p.m.) and two different protein monomer concentrations (1.50 and 2.02 μM) with the loading volume of 130 μl. The Tpa1 CTD (residues 273–644) samples in 20 mM Tris–HCl buffer (pH 7.5) containing 200 mM NaCl were measured at 240 and 280 nm using a two-sector cell at 30 000 r.p.m. and two different monomer concentrations of 1.97 and 3.93 μM with the loading volume of 180 μl at 20°C. The concentrations of the Tpa1Δ20 and CTD proteins were calculated using ε280 nm = 98 905 M–1cm–1 and ε280 nm = 50 880 M–1cm–1, respectively, obtained from their amino acid compositions.
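The protein concentrations above rest on the Beer-Lambert relation, A = ε·c·l. The short sketch below converts between absorbance at 280 nm and molar concentration using the extinction coefficients quoted in the text. The optical path length is an assumption (the text does not state it), and the function names are hypothetical.

```python
EPSILON_280 = {
    "Tpa1_delta20": 98905.0,  # M^-1 cm^-1, from the text
    "Tpa1_CTD": 50880.0,      # M^-1 cm^-1, from the text
}


def concentration_from_a280(a280, epsilon, path_cm=1.0):
    """Molar concentration from absorbance via Beer-Lambert (path length assumed)."""
    return a280 / (epsilon * path_cm)


def a280_from_concentration(conc_molar, epsilon, path_cm=1.0):
    """Expected absorbance at 280 nm for a given molar concentration."""
    return epsilon * conc_molar * path_cm


if __name__ == "__main__":
    # Expected absorbance of the more concentrated Tpa1-delta20 sample (2.02 uM)
    a = a280_from_concentration(2.02e-6, EPSILON_280["Tpa1_delta20"])
    print(f"A280 at 2.02 uM, 1 cm path: {a:.3f}")
```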
To assess the mRNA-binding ability of the purified Tpa1, a 20-mer poly(rA) was synthesized (Bioneer, Korea) and was 5′-labeled with [γ-32P]ATP by T4 polynucleotide kinase. The reaction buffer contained 15 mM sodium HEPES at pH 7.5, 100 mM potassium acetate, 0.2% bovine serum albumin, 20 units of RNase inhibitor and 5.0 mM DTT. The Tpa1Δ20 proteins (final concentration of 5.0 μM) were added to the reaction mixture prior to the labeled poly(rA). Ferrous ascorbate (2.0 mM), 2OG (2.0 mM) or EDTA (10 mM) was also added to the reaction mixture. As a control, 15 μM bovine serum albumin was added instead of Tpa1Δ20. The mixture was incubated for 1 h at 37°C and the incubated mixture was resolved on a 5% (w/v) polyacrylamide gel. After electrophoresis at 4°C, the gel was exposed to an X-ray film.
Because overexpression of the full-length yeast Tpa1 (residues 1–644) was not successful, several truncated constructs were made. Although we could express the C-terminal truncated form of Tpa1 (residues 1–635) and both N- and C-terminal truncated form of Tpa1 (residues 21–635) in E. coli cells and crystallize them, the quality of these crystals was poor. Only the N-terminal truncated form of Tpa1 (residues 21–644) yielded well-diffracting crystals in the presence of ferrous ascorbate, a cofactor of Tpa1. The structure of Tpa1 was solved by SAD phasing (Table 1).
We have determined three crystal structures of S. cerevisiae Tpa1: (i) the binary complex of the native protein with Fe(III), (ii) the ternary complex of the native protein with both Fe(III) and 2OG and (iii) the binary complex of the SeMet-labeled protein. Refinement statistics are summarized in Table 1. An example of the electron density for the refined model of the ternary complex is shown in Figure 2A. Since the crystals were grown under aerobic conditions, the Tpa1-bound iron in the crystal likely exists in the inactive Fe(III) state, which can be produced by either direct attack by molecular oxygen or an uncoupled reaction pathway (i.e. conversion of 2OG to CO2 and succinate with no corresponding substrate transformation) (20). In the ternary complex model, three internal regions of the polypeptide chain (Thr270–Ser276, Gly306–Ser327 and Asp561–Gly585) and terminal residues (Leu21–Glu23, Asp636–Ala644 and a C-terminal affinity tag LEHHHHHH) are disordered. In the binary complex model of the native Tpa1, Ser269 is also disordered. In the binary complex structure of the SeMet-labeled Tpa1, three internal regions of the polypeptide chain (Asp95–Leu99, Gly306–Ser327 and Asp561–Gly585) and terminal residues (Leu21, Glu637–Ala644 and a C-terminal affinity tag LEHHHHHH) are disordered. In this model, the linker (Ser269–Ser276) between NTD and CTD is ordered and forms an α-helix. The disordered internal regions are not well conserved among the Tpa1 family members (Figure 1). They correspond to or occur adjacent to long insertions in S. cerevisiae Tpa1. The root-mean-square (R.M.S.) deviation between the binary and ternary complexes of the native protein is 0.23 Å for 557 Cα atom pairs. Deviations >1.0 Å occur at Pro210, Thr268 and Asn520, with a maximum deviation of 1.37 Å for Thr268. This result indicates that the binding of 2OG is not accompanied by a large structural change. The R.M.S. deviation between the model of the SeMet-labeled Tpa1 and the binary (or ternary) complex of the native Tpa1 is 0.58 Å for 552 (or 553) Cα atom pairs.
The structure of Tpa1 bound to both Fe(III) and 2OG is shown in Figure 2C. S. cerevisiae Tpa1 is elongated with approximate dimensions of 90 Å × 55 Å × 45 Å. It comprises two domains of unequal size: the smaller NTD (residues 24–268) and the larger CTD (residues 277–635) (Figure 2C). Unexpectedly, both NTD and CTD adopt the same DSBH fold (24) in their cores (Figure 2B), although no sequence similarity is detected between them. The R.M.S. difference between NTD and CTD is 2.8 Å for 180 Cα atom pairs.
NTD contains 8 helices (composed of five α-helices and three 310-helices) and 11 β-strands (Figure 3A). Five antiparallel β-strands β1↑-β9↓-β6↑-β11↓-β4↑ form the larger sheet of the β-sandwich, while four antiparallel β-strands β8↓-β7↑-β10↓-β5↑ form the smaller sheet. The remaining two antiparallel β-strands β2↑-β3↓ close the opening of the two sheets, resembling a lid. A cluster of four helices (α1, α2, α3 and α4) cover the larger β-sheet of the β-sandwich. Eight β-strands—β9, β6, β11, β4, β8, β7, β10 and β5—constitute the DSBH fold of Tpa1 NTD. The DSBH topology may be considered as a special case of β-sandwiches and is built of eight β-strands that form the two four-stranded antiparallel β-sheets of the β-sandwich (34,35).
CTD contains 11 helices (composed of nine α-helices and two 310-helices) and 11 β-strands (Figure 3A). Five antiparallel β-strands β15↑-β22↓-β17↑-β20↓-β12↑ form the larger sheet of the β-sandwich, while four antiparallel β-strands β16↓-β21↑-β18↓-β19↑ form the smaller sheet. The remaining two antiparallel β-strands β13↑-β14↓ close the opening of the two sheets. Eight β-strands—β15, β22, β17, β20, β16, β21, β18 and β19—constitute the DSBH fold of Tpa1 CTD. A cluster of seven helices (α7, α8, α9, α10, α12, α13 and α14) cover the larger β-sheet of the β-sandwich and another helix α11 covers the smaller β-sheet. Two cysteine residues (Cys494 and Cys635) in CTD are close to each other. In the native Tpa1–Fe(III) binary complex, a disulfide bond is formed between Cys494 and Cys635, with a distance of 2.04 Å between the sulfur atoms. In the native Tpa1–Fe(III)–2OG ternary complex, the disulfide bond appears to be only partially formed, with a distance of either 2.03 or 3.75 Å between the sulfur atoms for the two conformations. In the SeMet-labeled Tpa1–Fe(III) binary complex, no disulfide bond exists between Cys494 and Cys635, with a distance of 4.65 Å between the sulfur atoms. This may be due to the presence of a reducing agent (TCEP) in the crystallization condition.
NTD and CTD of S. cerevisiae Tpa1 are arranged in a face-to-face fashion, with their β-sandwiches being nearly orthogonal to each other (Figure 2C). The smaller sheets of the NTD and CTD β-sandwiches face toward the domain interface. The solvent accessible surface area buried at the interface between the two domains in the ternary complex is ~1170 Å2 (~4.7% of the monomer surface area), with 42.6% of the atoms in this interface being polar (Protein–Protein Interaction Server at http://www.biochem.ucl.ac.uk/bsm/PP/server/). The surface representation of S. cerevisiae Tpa1 reveals a deep cleft (~40 Å long and ~20 Å wide), which runs roughly from the 2OG-binding site of NTD to the corresponding, unoccupied site of CTD (Figure 3B). The deep cleft is contributed by β2 and the following loop (Lys83, Asp86, Ile87 and Tyr88), β3 (Val90 and Gln92), β5 and the following loop (Leu156, His159, Ile163 and Arg166), the loop between α6 and α7 (Glu283 and Asp287), the loop between β13 and β14 (His410 and Lys411) and the loop between β16 and β17 (Leu513 and Thr515). This cleft is open at both ends. Each of the positively charged residues Lys83 and Lys411 is located near the end of this cleft (Figure 3B). Lys411 is conserved in both S. pombe Ofd1 and human Tpa1, while Lys83 is conserved in S. pombe Ofd1 only (Figure 1). This cleft may provide a binding site for an unknown main substrate of Tpa1. In both the ternary and binary complexes, four sulfate ions are bound to the positively charged surface on the ‘back’ side of the S. cerevisiae Tpa1 NTD (Figure 3B). Lys152, Lys182, Arg190 and Lys236 on this surface (Figure 3B) are conserved in S. pombe Ofd1 (Figure 1).
Although a monomer of S. cerevisiae Tpa1 exists in each asymmetric unit of the crystal, it forms a dimeric unit, with approximate dimensions of 140 Å × 70 Å × 55 Å, by association of two monomers related by a crystallographic 2-fold symmetry through their CTDs (Figure 3D). The solvent accessible surface area buried at the interface between the two monomers in this dimeric unit is ~2030 Å2 (~8.1% of the monomer surface area), with 31.5% of the atoms in this interface being polar (Protein–Protein Interaction Server at http://www.biochem.ucl.ac.uk/bsm/PP/server/). This finding raises the possibility that S. cerevisiae Tpa1 may exist as dimers in solution.
To assess the oligomeric state of Tpa1 in solution, we carried out sedimentation equilibrium ultracentrifugation analyses. Typical data obtained at 2.02 μM of Tpa1Δ20 at 280 nm and at two speeds of 10 000 and 12 000 r.p.m. are shown in Supplementary Figure S1. Supplementary Figure S1A also shows fits for different models of the Tpa1Δ20 oligomer; monomer (1x), dimer (2x) and monomer–dimer equilibrium (1x–2x). Ultracentrifugal data at two speeds (10 000 and 12 000 r.p.m.) were jointly fitted to the given models and analyzed. The R.M.S. errors for the 1x and 2x models are 1.38×10–2 and 1.23×10–2, respectively, which indicate the poor quality of the fits. In contrast to the homogeneous models, the R.M.S. error for the reversible 1x–2x model gave a much improved value of 5.88×10–3 indicating the superior quality of the fit with a dissociation constant (Kd) of 3.5×10–6 M. The reversibility of the monomer–dimer equilibrium was confirmed by multiple speed experiments (8000, 10 000 and 12 000 r.p.m.). These data clearly indicate that Tpa1 exists as a reversible monomer–dimer mixture in solution. Other data on Tpa1Δ20 lead to the same conclusion.
For Tpa1 CTD (residues 273–644), typical data obtained at two concentrations of 1.97 and 3.93 μM of Tpa1 CTD at 280 nm and at the speed of 30 000 r.p.m. are presented in Supplementary Figure S1B. The R.M.S. errors for the 1x and 2x models are 3.63×10–2 and 2.99×10–2, respectively, which indicate the poor quality of the fits. In contrast to the homogeneous models, the R.M.S. error for the reversible 1x–2x model gave a much improved value of 1.05×10–2, indicating the superior quality of the fit with a dissociation constant (Kd) of 1.3×10–5 M. Based on these results, we suggest that the dimerization of Tpa1 observed in the crystal, which is mediated through the CTD, likely represents the mode of Tpa1 dimerization in solution.
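To make the reported dissociation constants more concrete, the sketch below solves a simple reversible monomer–dimer equilibrium for the free monomer concentration. It assumes the usual definitions Kd = [M]²/[D] and Ctotal = [M] + 2[D], with Ctotal expressed as monomer; this is a textbook mass-action model, not the fitting routine used for the ultracentrifugation data, and the function name is hypothetical.

```python
import math


def monomer_dimer(total_monomer_molar, kd_molar):
    """Return (free monomer conc., dimer conc., fraction of chains in dimers)."""
    ratio = 8.0 * total_monomer_molar / kd_molar
    # Positive root of 2[M]^2/Kd + [M] - Ctotal = 0
    monomer = (kd_molar / 4.0) * (math.sqrt(1.0 + ratio) - 1.0)
    dimer = monomer**2 / kd_molar
    fraction_in_dimer = 2.0 * dimer / total_monomer_molar
    return monomer, dimer, fraction_in_dimer


if __name__ == "__main__":
    # Concentrations and Kd values taken from the text
    for label, c_total, kd in [("Tpa1-delta20", 2.02e-6, 3.5e-6),
                               ("Tpa1 CTD", 3.93e-6, 1.3e-5)]:
        m, d, f = monomer_dimer(c_total, kd)
        print(f"{label}: [M] = {m:.2e} M, [D] = {d:.2e} M, "
              f"fraction of chains in dimer = {f:.2f}")
```

At micromolar loading concentrations comparable to these Kd values, both species are populated to a substantial extent, which is consistent with the better fit of the 1x–2x model relative to either homogeneous model.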
To identify structurally similar proteins we carried out a search using the DALI server (36). The three best matches belong to the Fe(II)-/2OG-dependent dioxygenase family. They are (i) the human PHD2, the hypoxia-inducible factor (HIF) prolyl 4-hydroxylase (37) (PDB code 2G19; an R.M.S. deviation of 3.0 Å for 194 equivalent Cα positions in residues 32–266 of Tpa1, a Z-score of 18.0 and a sequence identity of 18%), (ii) the Chlamydomonas reinhardtii prolyl 4-hydroxylase (P4H) (38) (PDB code 2JIG; an R.M.S. deviation of 2.4 Å for 162 equivalent Cα positions in residues 28–249 of Tpa1, a Z-score of 14.2 and a sequence identity of 15%) and (iii) the human ABH3 protein, an oxidative DNA/RNA repair enzyme (demethylase) (39) (PDB code 2IUW; an R.M.S. deviation of 3.0 Å for 161 equivalent Cα positions in residues 37–248 of Tpa1, a Z-score of 12.8 and a sequence identity of 10%).
We further elaborated the structural similarity search with individual domains of S. cerevisiae Tpa1. Using the NTD (residues 24–269) alone, the result is similar to that obtained using the whole structure of Tpa1. The highest structural similarity is obtained with the human PHD2 (an R.M.S. deviation of 2.9 Å for 193 equivalent Cα positions in residues 32–268 of Tpa1, a Z-score of 18.0 and a sequence identity of 17%). The second highest similarity is found with the C. reinhardtii P4H (an R.M.S. deviation of 2.4 Å for 162 equivalent Cα positions in residues 28–268 of Tpa1, a Z-score of 14.0 and a sequence identity of 15%). Using CTD (residues 277–635) alone, the highest Z-score is obtained with the human PHD2 (an R.M.S. deviation of 3.1 Å for 183 equivalent Cα positions in residues 344–635 of Tpa1, a Z-score of 13.1 and a sequence identity of 6%) and the next highest similarity is found with the human ABH3 protein (an R.M.S. deviation of 3.3 Å for 157 equivalent Cα positions in residues 349–635 of Tpa1, a Z-score of 10.0 and a sequence identity of 11%).
The observed structural similarity of Tpa1 with human and algal P4Hs implies functional relatedness. Indeed, the Tpa1 NTD has a deep cleft that resembles the active sites of the human PHD2 and C. reinhardtii P4H (37,38; Figure 4A, left panel). The putative active site cleft of S. cerevisiae Tpa1 NTD displays a high structural similarity with these P4Hs. It is lined with Ile87, Tyr88, Tyr150, Leu156, His159, Ile163, Arg166, Phe218, Val229, Trp244 and His246 (Figure 4A), many of which are hydrophobic. These residues are well conserved in other Tpa1 family members (marked using gray diamonds in Figure 1). Unlike the NTD of S. cerevisiae Tpa1, the human PHD2 and C. reinhardtii P4H, the substrate-binding pocket of the human ABH3 is considerably more polar (39). These findings suggest that S. cerevisiae Tpa1 could function as a hydroxylase targeting a proline residue. When the Tpa1 NTD structure is superimposed on the (Ser-Pro)5 peptide-bound structure of the C. reinhardtii P4H (PDB code: 3GZE) (40), the (Ser-Pro)5 peptide fits well into the active site cleft of Tpa1 (Figure 4A, left panel). In the human PHD2 complexed with the C-terminal oxygen-dependent degradation domain of HIF-1α (41), Arg252, Tyr310, Arg322 and Trp389 surround the central proline (a hydroxylated target) of the peptide substrate (Figure 4B). In the peptide complex of C. reinhardtii P4H, Arg93, Tyr140, Arg161 and Trp243 surround the central proline of the peptide substrate (Figure 4B). Among them, Arg322/Trp389 of the human PHD2 and Arg161/Trp243 of C. reinhardtii P4H are strictly conserved as Arg166/Trp244 in S. cerevisiae Tpa1 (Figures 1 and 4B). Arg252/Tyr310 of the human PHD2 and Arg93/Tyr140 of C. reinhardtii P4H are not conserved in S. cerevisiae Tpa1 and are replaced by Tyr88/Leu156.
When the sequences of the Tpa1 family members are aligned, we can recognize six sequence segments that are highly conserved (motifs I–VI in Supplementary Figure S2). Our structures show that motifs II, III and IV of the NTD of S. cerevisiae Tpa1 are involved in the binding of Fe(III) and 2OG. Motif II [L(L/M)xHDDx(I/L)xxRxIx(F/Y)ILYL] encompasses Leu156–Leu174 of S. cerevisiae Tpa1 (boxed in red in Supplementary Figure S2). Motif III [FFxVxPx1~2SFHxVxEV] covers Phe217–Val232 (boxed in green in Supplementary Figure S2), while motif IV [R(L/M)(S/A)xIxGW(Y/F)xxP] covers Arg238–Pro248 (boxed in pink in Supplementary Figure S2). ‘h’ is a hydrophobic residue, ‘x’ stands for any amino acid and the strictly conserved residues are in boldface. Despite the common DSBH fold for both NTD and CTD of S. cerevisiae Tpa1, Fe(III) and 2OG are bound to NTD only (Figure 2C). The NTD of S. cerevisiae Tpa1 possesses the highly conserved H1x(D/E)…H2 iron-binding motif (His159–Xxx–Asp161…His227) contributed by sequence motifs II and III (Supplementary Figure S2), and the 2OG-binding motif (Arg238–Xxx–Ser240) as part of motif IV (Figures 1 and 3C). ‘Xxx’ stands for any amino acid. His159 and His227 of S. cerevisiae Tpa1 are located at the C-terminal end of the strand β5 and the N-terminal end of β10, respectively (Figure 1). The two ligand-binding motifs are also present in other Fe(II)-/2OG-dependent dioxygenases (20). Their absence in the CTD of Tpa1 suggests a role of CTD other than the dioxygenase-like function.
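The bracketed motif notation used above can be translated mechanically into a regular expression for scanning other sequences. The sketch below is one possible translation, offered as an illustration only: the helper names are hypothetical, the set of residues treated as hydrophobic for ‘h’ is an assumption, ‘x1~2’ is read as one or two residues of any type, and the test peptide is synthetic rather than the actual Tpa1 sequence.

```python
import re

HYDROPHOBIC = "AVILMFWYC"  # assumption: residues accepted for 'h'


def motif_to_regex(motif):
    """Translate the motif notation, e.g. 'L(L/M)xHDD', into a regex string."""
    parts = []
    i = 0
    while i < len(motif):
        ch = motif[i]
        if ch == "(":                        # alternatives such as (L/M)
            close = motif.index(")", i)
            parts.append("[" + "".join(motif[i + 1:close].split("/")) + "]")
            i = close + 1
        elif ch == "x":
            span = re.match(r"x(\d+)~(\d+)", motif[i:])
            if span:                         # variable gap such as x1~2
                parts.append(".{%s,%s}" % (span.group(1), span.group(2)))
                i += span.end()
            else:                            # single residue of any type
                parts.append(".")
                i += 1
        elif ch == "h":                      # hydrophobic residue
            parts.append("[" + HYDROPHOBIC + "]")
            i += 1
        else:                                # literal, conserved residue
            parts.append(ch)
            i += 1
    return "".join(parts)


def find_motif(sequence, motif):
    """Return (1-based start, matched segment) for each non-overlapping match."""
    pattern = re.compile(motif_to_regex(motif))
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    MOTIF_II = "L(L/M)xHDDx(I/L)xxRxIx(F/Y)ILYL"
    synthetic = "GGS" + "LLAHDDAIAARAIAFILYL" + "KK"  # made-up peptide
    print(motif_to_regex(MOTIF_II))
    print(find_motif(synthetic, MOTIF_II))
```

In motif II the pattern is 19 positions long, matching the Leu156–Leu174 span, with the His and Asp of the iron-binding motif at the fourth and sixth positions.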
In the binary structure of the native Tpa1–Fe(III) complex, the active site metal ion is coordinated by the side chains of His159, Asp161 and His227 as well as by two tightly bound water molecules and one loosely bound water molecule. In the ternary structure of the Tpa1–Fe(III)–2OG complex, the C-1 carboxylate and C-2 keto group of 2OG replace two water molecules that coordinate the Fe(III) ion (Figure 3C). The third water molecule, trans to His159, remains attached to the Fe(III) ion. The C-5 carboxylate of 2OG makes a salt bridge with Arg238 of motif IV (Figure 3C). In the current EXPASY database, Lys236 is suggested to be a potential binding residue of 2OG. This assignment needs to be revised in view of our structural information. The closest distance between the side chain NZ atom of Lys236 and 2OG is 18.7 Å. Hydrogen bonds are present between the hydroxyl group of Tyr173 (on β6) and the C-5 carboxylate of 2OG, as well as between the hydroxyl group of Ser240 in motif IV and the C-5 carboxylate of 2OG via a water molecule. Furthermore, the aliphatic portion of 2OG is sandwiched between the side chains of Ile171 (on β6) and Val229 (on β10).
On the basis of the 2OG binding mode, the iron-binding H1x(D/E)…H2 motif of the Fe(II)-/2OG-dependent dioxygenase family members has been grouped into two categories (20). The first category (‘in line’ binding mode) is represented by clavaminic acid synthase (CAS) (42), taurine dioxygenase (TauD) (43), factor inhibiting HIF (FIH) (44,45) and alkylsulfatase (AtsK) (46). In these enzymes, the side chain of His1 and the C-1 carboxylate of 2OG are approximately co-planar, with the C-1 carboxylate lying on the opposite side of His1 and the C-2 keto group lying on the opposite side of the acidic residue. In the second category (‘off line’ binding mode), represented by carbapenem synthase (CarC) (47), anthocyanidin synthase (ANS) (48) and AlkB (49), the C-1 carboxylate of 2OG is positioned on the opposite side of His2, and the C-2 keto group is located on the opposite side of the acidic residue. The active site of Tpa1 bears resemblance to the ‘off line’ binding mode, with the C-1 carboxylate of 2OG being positioned on the opposite side of His227 (His2) (Figure 3C).
Other conserved sequence motifs I, V and VI (Supplementary Figure S2) are not directly involved in the binding of iron and 2OG. Motif I [FxxKxxD(I/L)Y(R/K)hxQ(S/T)xDL] encompasses Phe80–Leu96 in the NTD of S. cerevisiae Tpa1 (boxed in blue in Figure 1 and Supplementary Figure S2). It covers the strands β2 and β3, and a loop C-terminal to β3 (Supplementary Figure S2). It is located near the active site in the NTD of S. cerevisiae Tpa1. This motif contributes to one side of the active site cleft (Supplementary Figure S3). The side chains of Ile87 and Tyr88 of this motif line the deep hydrophobic cleft in the active site. The hydroxyl group of Tyr88 points toward the bulk solvent, making hydrogen bonds with the conserved Lys83 and Gln92. Asp86 and Arg89 interact with the backbone of CTD residues Leu513 and Glu550, respectively. Except for Lys83 (on β2), well-conserved residues from motif I reside on the loop regions (α2–β2, β2–β3, β3–α3). Motif I is neither related to the core of the DSBH fold nor involved in the binding of iron and 2OG. In other P4Hs, the loop region corresponding to the sequence motif I of Tpa1 displays a large conformational change upon binding of the peptide substrate (40,41). If this analogy can be extended, motif I of Tpa1 may play a role in recognizing the protein substrate.
Motif V [R(R/H)(F/W)(R/K)xGx1~2(F/Y)TL] covers Arg503–Leu513 in the CTD of S. cerevisiae Tpa1 (boxed in brown in Figure 1 and Supplementary Figure S2) and resides on the strands β15 and β16 and the loop connecting them. It is part of the DSBH fold of the CTD. The tripeptide sequence Phe511–Thr512–Leu513 is located near β5 and the β2–β3 loop of the NTD, contributing an additional hydrophobic surface patch to the substrate-binding pocket of the NTD (Supplementary Figure S3). Motif VI [LVhRDxxxLxFVK] covers Leu601–Lys613 in the CTD of S. cerevisiae Tpa1 (boxed in purple in Figure 1 and Supplementary Figure S2). It is also part of the DSBH fold of the CTD, and the conserved sequence Phe611–Val612–Lys613 is located adjacent to the Phe511–Thr512–Leu513 sequence of motif V and the Asp86–Ile87–Tyr88–Arg89 sequence of motif I. Lys613 of motif VI makes hydrogen bonds with the carbonyl groups of Thr85, Asp86 and Tyr88 of motif I in the NTD as well as with the side chain of His155 of the NTD.
Interestingly, the side chain of Val229 in the ternary complex structure of Tpa1 showed an extra electron density, which was interpreted as a hydroxylated valine (Figure 4D). In contrast, no such extra electron density was present in the binary complex structures of either the native or the SeMet-labeled Tpa1. In view of the weak electron density, we assumed half occupancy of the hydroxyl group, resulting in a B-factor of 19.5 Å² for the oxygen atom. Val229 is C-terminal to His2 (His227) of the iron-binding H1x(D/E)…H2 motif. We confirmed by DNA sequencing that the codon corresponding to Val229 was not mutated. In the ternary complex structure, the oxygen atom of the hydroxylated Val229 is located 9.5 Å away from the iron center (Figure 4D). The modification of Tpa1 Val229 may have resulted from an attack by a hydroxyl radical formed by the uncoupled reaction. The uncoupled reaction is often stimulated by inhibitors or poor substrates, may also occur in the absence of a primary substrate and can result in aberrant self-hydroxylation reactions involving the protein side chains (20). Such irreversible protein modifications have been observed in other 2OG-dependent hydroxylases such as TfdA, AlkB, TauD and ABH3 (39,50−52). The hydroxylated residues in TfdA, AlkB, TauD and ABH3 are Trp113, Trp178, Trp128/Trp240/Trp248 and Leu177, respectively, and the distance between the oxygen atom of the hydroxyl group and the iron center is 10.5, 4.7, 6.0/7.5/10.5 and 6.8 Å, respectively. Similarly, hydroxylated Val229 of Tpa1 is not directly bound to the iron ion. Instead, its hydroxyl group forms a hydrogen bond with Arg238 that stabilizes the C-5 carboxylate of 2OG. Observation of self-hydroxylation in S. cerevisiae Tpa1 suggests that it could catalyze post-translational modification of other components in the mRNP complex.
The report that Tpa1 plays a role as a component of the mRNP complex (13) raises the possibility that Tpa1 interacts directly with mRNA. To determine whether Tpa1 binds to mRNA, we performed gel-mobility shift assays using a radiolabeled 20-base poly(rA) and Tpa1Δ20. We found that the RNA band was shifted in the presence of Tpa1Δ20 complexed with Fe(II) and 2OG in a protein dose-dependent manner (Figure 4C, lanes 5−7). As a control, bovine serum albumin did not bind the same RNA in the presence of Fe(II) and 2OG (Figure 4C, lane 2). This result clearly indicates that the shift of the RNA band is due to the binding of poly(rA) by Tpa1. When residual metal ions bound to Tpa1Δ20 were removed by excess EDTA, the binding ability of Tpa1Δ20 was much reduced (Figure 4C, lane 3). The maximum poly(rA) binding by Tpa1Δ20 was observed when 2OG was added in addition to Fe(II) (Figure 4C, lanes 4 and 5). To assess the role of NTD and CTD of Tpa1 in RNA binding, we attempted to express each of them but we could express only CTD (residues 273−644). Tpa1 CTD also retarded the electrophoretic mobility of poly(rA) in a dose-dependent manner (Figure 4C, lanes 8−10). However, CTD showed a tendency to form high molecular weight aggregates upon binding poly(rA). This may have been caused by the lower solubility of the recombinant CTD compared to Tpa1Δ20. The structure of S. cerevisiae Tpa1 suggests that the binding of poly(rA) might occur on the positively charged surface patch on the ‘back’ side of Tpa1 opposite from the catalytic site (Figure 3B). This surface is lined with many positively charged residues (Lys152, Arg179, Lys180, Lys182, Arg190, Lys207, Lys233, Lys236, Lys354, Lys380, Lys384, Arg413 and Lys620). Four sulfate ions are bound to this surface in both the binary and ternary structures of the native S. cerevisiae Tpa1, possibly mimicking the phosphate backbone of poly(rA) (Figure 3B).
Our present study provides the structural details of S. cerevisiae Tpa1 and supports the previous assignment of Tpa1, made on the basis of the primary sequence, as a member of the Fe(II)-/2OG-dependent dioxygenase family. The structural information on S. cerevisiae Tpa1 has significant biological implications, because the Tpa1 family members were shown to be required for the control of translation termination, mRNA poly(A) tail length and mRNA stability as an essential component of the mRNP complex in S. cerevisiae and for oxygen sensing in the transcriptional regulation of gene expression in S. pombe. Members of the Tpa1 family are also present in higher eukaryotes including humans and they are likely to play important biological roles. Our crystal structure of S. cerevisiae Tpa1 is the first structure of a Tpa1 family member, representing an important step toward understanding the biological functions of this protein family. It shows that the two domains (NTD and CTD) of Tpa1 share the DSBH fold (Figure 2B). However, the binding of Fe(III) and 2OG via the iron-binding motif H1x(D/E)…H2 and the Arg–Xxx–Ser sequence motif is observed only in the NTD of S. cerevisiae Tpa1 (Figure 3C). This suggests that the NTD of Tpa1 family members could function as a P4H-like enzyme or as a sensor of molecular oxygen.
Ofd1 of S. pombe, a member of the Tpa1 family, is an uncharacterized P4H-like Fe(II)/2OG-dependent dioxygenase. It was reported that its C-terminal degradation domain accelerates the degradation of Sre1N in the presence of oxygen (19). Under low oxygen conditions, the sterol regulatory element binding protein (Sre1), an endoplasmic reticulum membrane-bound transcription factor, is proteolytically cleaved and the released N-terminal transcription factor (Sre1N) activates gene expression essential for hypoxic growth (19). It was shown that the Ofd1 N-terminal dioxygenase domain is required for oxygen sensing and the Ofd1 CTD accelerates Sre1N degradation (19). These data support a model whereby the Ofd1 N-terminal dioxygenase domain is an oxygen sensor that regulates the activity of the C-terminal degradation domain (19). Our structure of S. cerevisiae Tpa1, together with the conserved sequence features, indicates that the NTD of S. pombe Ofd1 has the potential to function as an oxygen sensor. Unlike the fission yeast, however, S. cerevisiae does not contain an Sre1 pathway (19). The role of S. cerevisiae Tpa1 as part of the mRNP complex likely depends on the Fe(II)-/2OG-dependent dioxygenase-type hydroxylase domain in its N-terminal half. It may control mRNA stability indirectly via post-translational modification of other components of the mRNP complex. We also demonstrated through an RNA binding assay (Figure 4C) that the S. cerevisiae Tpa1 can bind to poly(rA), suggesting that it could interact directly with mRNA in the mRNP complex. In S. pombe, negative regulator of Ofd1 (Nro1) was found to bind to the Ofd1 CTD and to inhibit Sre1N degradation under low oxygen (22). The counterpart of Nro1 in S. cerevisiae has been detected by global affinity purification studies (14,22). The role of the CTD of S. cerevisiae Tpa1 has yet to be established; this study, together with the lack of the ligand-binding motifs, supports a non-enzymatic role.
The coordinates and structure factors for yeast Tpa1 have been deposited in the Protein Data Bank under accession numbers 3KT1, 3KT7 and 3KT4 for the binary and ternary complexes of the native Tpa1 and the binary complex of SeMet-substituted Tpa1, respectively.
Supplementary Data are available at NAR Online.
Basic Science Outstanding Scholars Program, World-Class University Program (grant no. 305-20080089) and Basic Research Program Grant (R01-2006-000-10311-0) of the National Research Foundation of Korea, Korea Ministry of Education, Science and Technology. Funding for open access charge: Basic Science Outstanding Scholars Program and World-Class University Program (grant no. 305-20080089) of the National Research Foundation of Korea, Korea Ministry of Education, Science and Technology.
The authors thank the beamline staff for assistance during X-ray data collection at Photon Factory (beamlines BL-5A, BL-17A and NW12A) and Pohang Light Source (beamlines BL-4A and BL-6C). We also thank Dr V. Narry Kim for providing laboratory space and equipment for the electrophoretic mobility shift assay.