Proteins. Author manuscript; available in PMC 2009 December 11.
Published in final edited form as:
PMCID: PMC2792022

Deep Trefoil Knot Implicated in RNA Binding Found in an Archaebacterial Protein


The (βα)8 (TIM) barrel is one of the most abundant and versatile folds in nature.1 Although its structure is conserved, its primary sequence is highly divergent and gives rise to a plethora of distinct functions. A substantial fraction of the structures deposited in the Protein Data Bank contain the TIM barrel fold,1 and the TIM barrel appears to be the most common fold in yeast.2 Nearly all TIM barrel proteins are enzymes, and so far, 15 distinct enzymatic functions have been assigned to TIM barrel-containing proteins.3 (The exception, narbonin, has no known function.4) Most commonly, their active sites are positioned within the βα loops at the C-terminal end of the barrel.5 TIM barrels are differentiated by auxiliary features such as the location (and nature) of additional domains, the identity of the cofactor, the number of (βα) units, and the location of the barrel major axis,5 and are classified by strand number and shear number.6 In general, in TIM barrels and most other protein folds the main chain does not cross over itself to form a knot, although knotted topologies have been reported in a few proteins.7–9

The M. thermoautotrophicum MT1 gene is conserved in archaea, lies in a ribosomal protein operon,10 and codes for a 268 amino acid protein of unknown function. We report here the structure of MT1, which is novel from several standpoints: (i) the structure contains a novel topological unit, a deep C-terminal trefoil knot, observed here for the first time in a TIM barrel-like fold and in an archaebacterial protein and only rarely observed in other proteins7–9; (ii) structurally, it contains only five (βα) units, and the arrangement of its hydrophobic and hydrophilic surfaces is opposite to that found in classical TIM barrel proteins; (iii) functionally, although it lacks typical features found in enzymes of the barrel family, it has strongly conserved residues clustered on the surface that form a potential catalytic site; (iv) the structure provides a first example of a barrel-like fold linked to an RNA-binding domain, suggesting an extension of TIM barrel functionality to nucleic acid binding and/or catalysis.


The MT1-coding region was cloned into pET-15b (Novagen) as a fusion with a His6 affinity tag and a thrombin cleavage site. The protein was expressed in E. coli BL21 (DE3) containing a plasmid that overexpresses rare E. coli tRNAs. The protein was purified by metal affinity chromatography on Ni-NTA Superflow resin (Qiagen). The Se-Met derivative was produced as described previously.11

Crystallization trials were performed by using Screens I and II (Hampton Research). The best crystals were grown from a solution containing 5.7 mg/mL protein, 15% v/v Jeffamine M-600, 50 mM MES (pH 6.5), and 25 mM CsCl in a drop equilibrated against 30% v/v Jeffamine M-600, 100 mM MES (pH 6.5), and 50 mM CsCl at 23°C. The monoclinic C2 crystals (a = 101.307 Å, b = 51.353 Å, c = 109.25 Å, β = 94.69°) contain two molecules in the asymmetric unit related by a twofold noncrystallographic symmetry axis. A three-wavelength MAD data set was collected to 2.9 Å, and native data were collected to 2.3 Å, at 100 K on the SBC 19-ID beamline at the Advanced Photon Source of Argonne National Laboratory by using a 3 × 3 mosaic CCD detector. All data were analyzed, indexed, and scaled by using HKL2000 (Table I).12
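As a rough plausibility check on the reported cell contents, the unit cell volume and Matthews coefficient can be estimated from the cell parameters quoted above. The sketch below is illustrative only and is not part of the original analysis. It assumes four asymmetric units per C2 unit cell, an approximate mass of 110 Da per residue for the 268-residue protein (ignoring the tag and selenomethionine substitution), and the common approximation that the solvent fraction is 1 − 1.23/Vm.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume of a monoclinic unit cell in cubic angstroms: V = a*b*c*sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(volume, n_asu, n_mol_per_asu, mol_weight_da):
    """Matthews coefficient Vm (A^3/Da): cell volume per dalton of protein."""
    return volume / (n_asu * n_mol_per_asu * mol_weight_da)


# Cell parameters and molecules per asymmetric unit as quoted in the text.
A, B, C, BETA = 101.307, 51.353, 109.25, 94.69
MOLECULES_PER_ASU = 2

# Assumptions (not stated in the text): 4 asymmetric units per C2 cell
# and roughly 110 Da per residue for a 268-residue chain.
N_ASU_C2 = 4
APPROX_MW_DA = 268 * 110.0

volume = monoclinic_cell_volume(A, B, C, BETA)
vm = matthews_coefficient(volume, N_ASU_C2, MOLECULES_PER_ASU, APPROX_MW_DA)
# Approximate solvent fraction from the Matthews coefficient.
solvent_fraction = 1.0 - 1.23 / vm

print(f"V = {volume:.0f} A^3, Vm = {vm:.2f} A^3/Da, solvent = {solvent_fraction:.0%}")
```

Under these assumptions the estimate falls within the range typically seen for protein crystals, which is consistent with two molecules in the asymmetric unit.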

Table I
Data Collection Parameters

The selenium substructure was solved by using the program SnB13 and refined with the program SHARP.14 Solvent flattening, histogram matching, and twofold noncrystallographic symmetry averaging, followed by electron density map calculation, were performed by using the CCP4 suite15 and MAPMAN16 (Table II). The model was built manually by using O17 and refined against the 2.3 Å native data by using CNS.18 The final model has Rwork = 22.1% and Rfree = 27.7% and includes 120 water molecules. The Ramachandran plot calculated with the program PROCHECK (distributed with the CCP4 suite15) shows that all residues have favorable Φ and Ψ angles (Table III). The electron density maps calculated from the final model refined at 2.3 Å were clear for all main-chain atoms except for the last six and the last four residues in the A and B monomers, respectively. The maps do not show density for residues N-terminal to the initiating methionine. The coordinates have been deposited in the Protein Data Bank with accession code 1K3R. Tertiary structure alignments were performed by using the DALI web server.19
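For readers less familiar with the refinement statistics quoted above, the conventional crystallographic R-factor compares observed and calculated structure-factor amplitudes. Rwork is computed over the reflections used in refinement and Rfree over a test set held out from refinement. The following is a minimal sketch of that formula, not the CNS implementation. It omits the scale factor normally applied between observed and calculated amplitudes, and the amplitudes shown are hypothetical values for illustration, not data from this study.

```python
def r_factor(f_obs, f_calc):
    """Conventional R = sum(| |Fo| - |Fc| |) / sum(|Fo|) over paired reflections."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# Hypothetical amplitudes for illustration only (not data from this study).
work_obs = [100.0, 80.0, 60.0, 40.0]
work_calc = [90.0, 85.0, 50.0, 45.0]

r_work = r_factor(work_obs, work_calc)
print(f"R = {r_work:.3f}")
```

The same function applied to the held-out reflections would give Rfree; a gap of a few percentage points between the two, as reported here, is common.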

Table II
Phasing Power Statistics
Table III
Model Refinement and Quality


The sequence homologs of MT1 are found only in archaea and eukaryotes (Fig. 1) and show no significant sequence similarity with other proteins of known structure. The MT1 monomer consists of a large dimerization domain (MT1-DD) and a small β-barrel auxiliary domain (MT1-CSD). Two monomers dimerize via the helical faces of their MT1-DDs, burying 3740 Å² (28%) of accessible surface area from each monomer (Fig. 2). The MT1-CSD is inserted in the second (αβ) loop region of the dimerization domain, and the two are connected by an α-helix and a 3₁₀ helix (Fig. 2). The β-barrel and the TIM barrel-like dimerization domains create a continuous, 95 Å long, positively charged surface for possible interaction with nucleic acid. Several strictly conserved residues are found on the dimerization interface as well as on the MT1-DD/CSD interfaces.

Fig. 1
Sequence alignment of MT1 with proteins from a representative set of organisms using the program CLUSTALW.31 Completely conserved residues are highlighted in red; other conserved residues are highlighted in blue. Secondary structural elements are based ...
Fig. 2
Overall structure of MT1 dimer. Stereoview is along non-crystallographic two-fold axis. Each monomer is separately colored and the knot region is marked in both monomers. The loop is dark blue and the C-terminal sequence threaded through the loop is red. ...

The MT1-DD consists of 197 amino acids (residues 1–92 and 160–264) (Fig. 1). Analysis of structural homologs using the program DALI19 showed that the MT1-DD shares significant structural similarity with TIM barrel enzymes, including E. coli methylenetetrahydrofolate reductase (MTHFR) (PDB acc. no. 1B5T, Z score: 5.7, RMSD: 3.1 Å over 119 equivalenced residues), M. kandleri coenzyme F420-dependent tetrahydromethanopterin reductase (PDB acc. no. 1EZW, Z score: 5.4, RMSD: 3.8 Å over 124 equivalenced residues), and rabbit muscle pyruvate kinase (PDB acc. no. 1A49, Z score: 4.6, RMSD: 3.4 Å over 117 equivalenced residues). Despite missing half of the structural elements of a TIM barrel, the five strands and four helices of the MT1-DD superpose with the (α/β)8 arrangement of the MTHFR TIM barrel with an RMSD of 3.1 Å (β-strands 1, 2, 3, 7, and 8 and α-helices 1, 2, 7, and 8 are present in MT1-DD, using the numbering convention from MTHFR [Fig. 3]). The third helices of MT1-DD and MTHFR do not overlap structurally because the MT1-DD α-helix is pulled out of position to connect with β-strand 7 (Fig. 4). Thus, the 4th, 5th, and 6th (βα) units of MTHFR are missing in MT1-DD. Notably, the MT1 dimer does not reconstitute a complete TIM barrel structure. The existence of the stable MT1-DD partial TIM barrel supports the notion that the ancestral TIM barrel was a half-barrel20 and suggests that the MT1-DD may be an early prototype of a TIM barrel. The β2 strand of the MT1 homolog from S. pombe is flanked by a glycine and a proline, with a spacing and internal β-strand sequence consistent with other TIM barrels21 (Fig. 1). The MT1-DD structure also resembles a classical nucleotide-binding Rossmann fold.22 However, MT1-DD has no conserved nucleotide-binding motifs.
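The RMSD values quoted for the DALI alignments are root-mean-square distances between equivalenced Cα positions after the two structures have been superposed. The sketch below shows only that final distance calculation and assumes the coordinates have already been superposed; it does not reproduce DALI's alignment or superposition steps. The coordinates are hypothetical values chosen for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between paired 3D points that are assumed to be already superposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical, already-superposed C-alpha positions (angstroms), illustration only.
pairs_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
pairs_b = [(0.0, 3.0, 0.0), (3.8, 0.0, 4.0), (7.6, 0.0, 0.0)]

value = rmsd(pairs_a, pairs_b)
print(f"RMSD = {value:.2f} A over {len(pairs_a)} equivalenced positions")
```

Because the sum runs only over equivalenced residues, an RMSD is meaningful only together with the number of residues it covers, which is why both are reported for each hit.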

Fig. 3
Secondary structure and topology map of MT1 showing connectivity β-strands and α-helices. All secondary structural elements that are part of the TIM barrel are numbered according to the corresponding elements of MTHFR (see text). The remaining ...
Fig. 4
Cα trace and stereo view of MT1-DD and MT1-CSD structural overlap. (a) The MT1-DD (red) is shown overlapped with the corresponding domain of MTHFR (green). Corresponding α-helices and β-strands are labeled as α and β, ...

At its C-terminus, the dimerization domain contains a knot. The 35 C-terminal residues are threaded through a loop on a surface that connects α-helix 7 with β-strand 7 (Figs. 2, 3, and 5). This is the first time that this architecture has been observed in a barrel-like structure. The knot region comprises a loop with a short β-strand that connects β-strand 8 with the C-terminal α-helix 8. The crossover involves residues Val233, Asn234, Ala193, and Ser194 (Figs. 1 and 5). The sequence that is threaded through the loop is not conserved, with the exception of Asp230 and Pro237 on the C-terminal side of the knot. There is also a conserved proline residue within the loop (Pro195) that is located ~5 Å from Pro237. Several hydrophobic residues are scattered throughout the sequence, and the length of the region varies among homologs (24–37 residues). The knot conformation is stabilized on the N-terminal side by Trp232, which appears to act as an anchor, and on the C-terminal side by a hydrogen bond between Arg191 and Glu239. This region of MT1-DD is involved in dimer formation. Threading 35 residues through the loop region requires a major structural rearrangement (or cleavage and religation) of the protein main chain.

Fig. 5
The MT1 trefoil knot. Residues 1–190 and 199–229 are shown in solvent accessible surface representation (1.4 Å radius). The knot loop (residues 191–198) is in blue and the polypeptide chain threaded through the loop (residues ...

Only a few other proteins have a knotted fold.7–9 Interestingly, a very similar knot structure has been reported recently for the catalytic domain of the RrmA protein from Thermus thermophilus, which is predicted to be a 2′-O-ribose methyltransferase.8 RrmA shares strong structural similarity with MT1-DD, including the knot region (PDB acc. no. 1IPA, Z score: 12.7, RMSD: 2.4 Å over 129 equivalenced residues). These proteins also show a strikingly similar design. In MT1, a half-TIM barrel is fused with a putative cold-shock RNA-binding domain. In RrmA, the three-layer sandwich is fused with a eukaryotic ribosomal protein L30.8 Despite the structural and perhaps functional similarities, MT1 and RrmA share virtually no sequence similarity, and the proposed catalytic residues in RrmA are not conserved in the MT1 family. Present data suggest that the machinery responsible for creating the knot structure is present in bacteria, archaea, and eukaryotes.

The charge properties of the MT1-DD, with a polar interior and a hydrophobic exterior, are unlike those found in TIM barrels. Within the barrel, MT1-DD has an atypical abundance of charged and polar residues that point into the center of the barrel. The polar nature is further strengthened by helix α3 and the loop linking strands β8 and β9, which shield several hydrophobic residues (data not shown). In contrast, its α-helical face is unusually hydrophobic and drives dimerization.

MT1 has an auxiliary domain inserted into a loop of the TIM barrel [Figs. 1 and 4(b)]. The auxiliary domain is 67 amino acids long and comprises residues 93–159 of the MT1 protein [Figs. 1 and 2]. The program DALI shows that this domain shares significant 3D structural similarity with three bacterial proteins: E. coli major cold-shock protein CspA (PDB acc. no. 1MJC, Z score: 5.6, RMSD: 2.5 Å over 58 equivalenced residues), the S1 RNA-binding domain of E. coli polyribonucleotide nucleotidyl transferase (PDB acc. no. 1SRO, Z score: 5.0, RMSD: 2.3 Å over 52 equivalenced residues), and Thermus thermophilus S17 protein (PDB acc. no. 1FJF_Q, Z score: 5.1, RMSD: 2.6 Å over 54 equivalenced residues). All four proteins are five-stranded antiparallel β-barrels that share an identical topology, known as the cold-shock domain23 [Fig. 4(b)]. This type of structure is also classified as an oligonucleotide-binding (OB) fold. Preliminary data suggest that MT1 is not a cold-shock protein (Giometti and Tollaksen, personal communication). The electrostatic potential surface map drawn by using GRASP24 also indicates a potential nucleic acid-binding role. MT1 has a distinct positively charged face (data not shown). Although the C-terminal β-sheet residues of the barrels contribute a small portion of the positive charge, most of the charge is located in the MT1-CSDs. This charge distribution is similar to that of other nucleic acid-binding proteins with this fold.
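The residue ranges quoted for the two domains can be checked for internal consistency with a little arithmetic. The short sketch below is an illustrative bookkeeping check rather than anything from the original analysis. It uses only the ranges given in the text; the function and variable names are hypothetical.

```python
def span_length(first, last):
    """Number of residues in an inclusive residue range."""
    return last - first + 1


# Residue ranges as quoted in the text.
MT1_DD_SEGMENTS = [(1, 92), (160, 264)]
MT1_CSD_SEGMENT = (93, 159)

dd_length = sum(span_length(first, last) for first, last in MT1_DD_SEGMENTS)
csd_length = span_length(*MT1_CSD_SEGMENT)

# The inserted domain should exactly fill the gap between the two MT1-DD segments.
fills_gap = (MT1_CSD_SEGMENT[0] == MT1_DD_SEGMENTS[0][1] + 1
             and MT1_CSD_SEGMENT[1] == MT1_DD_SEGMENTS[1][0] - 1)

print(dd_length, csd_length, fills_gap)
```

The two lengths agree with the 197 and 67 residues stated for MT1-DD and MT1-CSD, and together the two domains account for residues 1–264 of the 268-residue protein.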

It is likely that MT1 binds single- or double-stranded RNA, for the following reasons. First, MT1-CSD has significant structural similarity with CspA, which is an RNA chaperone that binds RNA to prevent hairpin formation for transcription antitermination,25,26 and with the ribosomal protein S17, which binds double-stranded regions of the 16S rRNA. Second, the gene for MT1 is located within a cluster of genes related to ribosomal function. However, we cannot rule out the possibility that MT1 binds DNA because the OB fold is found in many ssDNA-binding proteins,27 including the product of the BRCA2 tumor suppressor gene, which contains three such units and binds single-stranded DNA.28

The MT1 structure, particularly the inserted domain, provides additional support for the observation that organisms accomplish complex tasks by combining a limited number of protein modules. There are at least two other examples of TIM barrel proteins that contain a similar insertion. Rabbit muscle pyruvate kinase contains a nine-stranded barrel inserted after the third βα loop. In that protein, the β-barrel domain forms part of the active site with another structurally unrelated domain.29 The Bacillus cereus β-amylase TIM barrel contains a seven-stranded barrel, which forms the maltose-binding site and is inserted at the C-terminus of the protein.30 In both instances, the inserted β-barrel domains are important for enzymatic catalysis.


We thank Dr. Alexey Murzin for pointing out the importance of the knot structure and for discussion, Sandra Tollaksen and Dr. Carol Giometti for performing two-dimensional gel electrophoresis experiments, and Lindy Keller for assistance in preparation of the manuscript. We also thank the staff of the Structural Biology Center for their support.

Grant sponsor: National Institutes of Health; Grant number: GM62414; Grant sponsor: U.S. Department of Energy, Office of Biological and Environmental Research; Grant number: W-31-109-Eng-38; Grant sponsor: Ontario Research and Development Challenge Fund.

The submitted manuscript has been created by the University of Chicago as Operator of Argonne National Laboratory (“Argonne”) under Contract No. W-31-109-ENG-38 with the U.S. Department of Energy. The U.S. Government retains for itself, and others acting on its behalf, a paid-up, nonexclusive, irrevocable worldwide license in said article to reproduce, prepare derivative works, distribute copies to the public, and perform publicly and display publicly, by or on behalf of the Government.


1. Wierenga RK. The TIM-barrel fold: a versatile framework for efficient enzymes. FEBS Lett. 2001;492:193–198.
2. Jansen R, Gerstein M. Analysis of the yeast transcriptome with structural and functional categories: characterizing highly expressed proteins. Nucleic Acids Res. 2000;28:1481–1488.
3. Hegyi H, Gerstein M. The relationship between protein structure and function: a comprehensive survey with application to the yeast genome. J Mol Biol. 1999;288:147–164.
4. Hennig M, Schlesier B, Dauter Z, Pfeffer S, Betzel C, Höhne WE, Wilson KS. A TIM barrel protein without enzymatic activity? Crystal-structure of narbonin at 1.8 Å resolution. FEBS Lett. 1992;306:80–84.
5. Farber GK, Petsko GA. The evolution of α/β enzymes. Trends Biochem Sci. 1990;15:228–234.
6. Nagano N, Hutchinson EG, Thornton JM. Barrel structures in proteins: automatic identification and classification including a sequence analysis of TIM barrels. Protein Sci. 1999;8:2072–2084.
7. Taylor WR. A deeply knotted protein structure and how it might fold. Nature. 2000;406:916–919.
8. Nureki O, Shirouzu M, Hashimoto K, Ishitani R, Terada T, Tamakoshi M, Oshima T, Chijimatsu M, Takio K, Vassylyev DG, Shibata T, Inoue Y, Kuramitsu S, Yokoyama S. An enzyme with a deep trefoil knot for the active-site architecture. Acta Crystallogr D Biol Crystallogr. 2002;58:1129–1137.
9. Takusagawa F, Kamitori K. A real knot in protein. J Am Chem Soc. 1996;118:8945–8946.
10. Arndt E, Kromer W, Hatakeyama T. Organization and nucleotide sequence of a gene cluster coding for eight ribosomal proteins in the archaebacterium Halobacterium marismortui. J Biol Chem. 1990;265:3034–3039.
11. Walsh MA, Dementieva I, Evans G, Sanishvili R, Joachimiak A. Taking MAD to the extreme: ultrafast protein structure determination. Acta Crystallogr D Biol Crystallogr. 1999;55:1168–1173.
12. Otwinowski Z, Minor W. Processing of x-ray diffraction data collected in oscillation mode. Methods Enzymol. 1997;276:307–326.
13. Miller R, Gallo SM, Khalak HG, Weeks CM. SnB: crystal structure determination via Shake-and-Bake. J Appl Crystallogr. 1994;27:613–621.
14. de la Fortelle E, Bricogne G. Maximum-likelihood heavy-atom parameter refinement for multiple isomorphous replacement and multiwavelength anomalous diffraction methods. Methods Enzymol. 1997;276:472–494.
15. Collaborative Computational Project, Number 4. The CCP4 Suite: programs for protein crystallography. Acta Crystallogr D Biol Crystallogr. 1994;50:760–763.
16. Kleywegt GJ, Jones TA. xdl MAPMAN and xdl DATAMAN—programs for reformatting, analysis, and manipulation of biomacromolecular electron-density maps and reflection data sets. Acta Crystallogr D Biol Crystallogr. 1996;52:826–828.
17. Jones TA, Kjeldgaard M. Electron-density map interpretation. Methods Enzymol. 1997;277:173–207.
18. Brünger AT. CNS, Crystallography and NMR System. Version 0.5. Yale University; New Haven, CT: 1998.
19. Holm L, Sander C. Protein structure comparison by alignment of distance matrices. J Mol Biol. 1993;233:123–128.
20. Hocker B, Deismann-Driemeyer S, Hettwer S, Lustig A, Sterner R. Dissection of a (βα)8-barrel enzyme into two folded halves. Nat Struct Biol. 2001;8:32–36.
21. Janecek S. Invariant glycines and prolines flanking in loops the strand β2 of various (α/β)8-barrel enzymes: a hidden homology? Protein Sci. 1996;5:1136–1143.
22. Song H, Parsons MR, Rowsell S, Leonard G, Phillips SEV. Crystal structure of intact elongation factor EF-Tu from Escherichia coli in GDP conformation at 2.05 Å resolution. J Mol Biol. 1999;285:1245–1256.
23. Schindelin H, Jiang W, Inouye M, Heinemann U. Crystal structure of CspA, the major cold shock protein of Escherichia coli. Proc Natl Acad Sci USA. 1994;91:5119–5123.
24. Nicholls A, Sharp KA, Honig B. Protein folding and association: insights from the interfacial and thermodynamic properties of hydrocarbons. Proteins. 1991;11:281–296.
25. Jiang W, Hou Y, Inouye M. CspA, the major cold-shock protein of Escherichia coli, is an RNA chaperone. J Biol Chem. 1997;272:196–202.
26. Bae W, Xia B, Inouye M, Severinov K. Escherichia coli CspA-family RNA chaperones are transcription antiterminators. Proc Natl Acad Sci USA. 2000;97:7784–7789.
27. Bycroft M, Hubbard TJP, Proctor M, Freund SMV, Murzin AG. The solution structure of the S1 RNA binding domain: a member of an ancient nucleic acid-binding fold. Cell. 1997;88:235–242.
28. Yang H, Jeffrey PD, Miller J, Kinnucan E, Sun Y, Thomä NH, Zheng N, Chen PL, Lee WH, Pavletich NP. BRCA2 function in DNA binding and recombination from a BRCA2-DSS1-ssDNA structure. Science. 2002;297:1837–1848.
29. Larsen TM, Laughlin T, Holden HM, Rayment I, Reed GH. Structure of rabbit muscle pyruvate kinase complexed with Mn2+, K+, and pyruvate. Biochemistry. 1994;33:6301–6309.
30. Mikami B, Adachi M, Kage T, Sarikaya E, Nanmori T, Shinke R, Utsumi S. Structure of raw starch-digesting Bacillus cereus beta-amylase complexed with maltose. Biochemistry. 1999;38:7050–7061.
31. Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994;22:4673–4680.
32. Kraulis JP. MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures. J Appl Crystallogr. 1991;24:946–950.