The existence of a new putative RNA-binding domain, abbreviated THUMP, was deduced from sequence comparison of various proteins known or predicted to be involved in RNA metabolism (23). For a long time, the THUMP domain remained essentially uncharacterized. Recently, new experimental data have been collected on three different THUMP-containing enzymes: ThiI from E.coli, Tan1 from S.cerevisiae and TrMet(m2,2G10) from P.abyssi. All three proteins are involved in tRNA modification and target nucleosides in the central 3D-core of the tRNA molecule. Thus, we proposed that THUMP may interact with a specific region of tRNA and target the catalytic domains of these various enzymes towards a common region of the substrate (9). Recently, two structures of ThiI family members have been solved: PH1313 from P.horikoshii (without any published analysis) and BA4899 from B.anthracis strain Ames (28). However, neither of these proteins has been biochemically characterized, and their activity remains putative [PH1313 is even predicted to be catalytically inactive (28)]. Thus, experimental validation that THUMP interacts with tRNA, and can therefore be called an RNA-binding domain, was still needed.
In this paper, we first used limited proteolysis and mass spectrometry to define the domain boundaries in PAB1283, a prototype member of the TrMet(m2,2G10) family. We then purified the N-terminal region [1–155] of PAB1283, which contains the THUMP domain, as a standalone protein. This autonomously folding unit was characterized as a soluble monomeric protein that, in contrast to the entire PAB1283 protein, shows only a very weak affinity for tRNA.
In order to gain insight into the structure of this domain and understand how it functions together with the C-terminal catalytic domain, we proposed an original modeling approach. It is now well established that in silico modeling can be a fast approach to predict the structure of a protein or a macromolecular complex. However, theoretical methods generate models that need experimental validation, and the number and diversity of the proposed solutions are often considerable. A promising strategy is the use of experimental data for model discrimination or refinement (50). Thus, we combined the experimental and theoretical analyses at three stages, corresponding to characterization of the primary structure (domain boundaries in the PAB1283 sequence), tertiary structure (3D fold of the THUMPα module), and quaternary structure (interactions between the two protein domains and the tRNA). First, it is satisfying to find that the experimentally determined boundaries of both domains of PAB1283 parallel those predicted by our FR analysis. Second, to distinguish among various threading models of THUMPα, we determined which residues are solvent accessible with a set of three labeling reagents and identified a disulfide bridge between the two cysteines found in the THUMPα sequence. These constraints allowed us to validate the hypothesis that the N-terminal structural module of PAB1283 is related to the equivalent module observed in the structure of ThiI (28), and that both proteins share not only the easily detectable THUMP domain, but also the strongly diverged NFLD domain. Therefore, the role of the NFLD domain may not be specific to ThiI, as previously suggested (28). Our model shows that THUMPα by itself does not present any particular concentration of positively charged residues, although its conserved residues map onto the surface corresponding to the ThiI homolog (28). These results suggest that the THUMP domain is not necessarily responsible per se for the affinity of TrMet(m2,2G10) for tRNA but rather may be used to target the catalytic domain to a particular region of the tRNA structure. The results of our experiments show that the linker between THUMPα and the catalytic domain of PAB1283 is flexible. As it is highly basic, it may participate in nucleic acid binding.
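The model-discrimination step described above (ranking threading models by how well they agree with the labeling and disulfide data) can be illustrated with a minimal sketch. It is not the procedure used in this work: the class `CandidateModel`, the functions `restraint_score` and `rank_models`, the residue numbers, the coordinates and both cutoffs (relative solvent accessibility of at least 0.2 for a labeled residue, Sγ–Sγ distance of at most 3.0 Å for a disulfide) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from math import dist
from typing import Dict, List, Optional, Tuple

Coord = Tuple[float, float, float]


@dataclass
class CandidateModel:
    """A threading model reduced to the quantities needed for scoring (hypothetical)."""
    name: str
    rsa: Dict[int, float]        # residue number -> relative solvent accessibility (0..1)
    sg_coords: Dict[int, Coord]  # cysteine residue number -> S-gamma coordinates (angstroms)


def restraint_score(model: CandidateModel,
                    labeled: List[int],
                    disulfide: Optional[Tuple[int, int]],
                    rsa_cutoff: float = 0.2,    # assumed threshold for "accessible"
                    max_sg_dist: float = 3.0    # assumed tolerance for an S-S bond
                    ) -> float:
    """Return the fraction of experimental restraints that the model satisfies."""
    # A chemically labeled residue should be solvent accessible in the model.
    satisfied = sum(1 for r in labeled if model.rsa.get(r, 0.0) >= rsa_cutoff)
    total = len(labeled)
    # The two bridged cysteines should have their sulfur atoms close together.
    if disulfide is not None:
        a, b = disulfide
        total += 1
        if a in model.sg_coords and b in model.sg_coords:
            if dist(model.sg_coords[a], model.sg_coords[b]) <= max_sg_dist:
                satisfied += 1
    return satisfied / total if total else 0.0


def rank_models(models: List[CandidateModel],
                labeled: List[int],
                disulfide: Optional[Tuple[int, int]]) -> List[CandidateModel]:
    """Sort candidate models from best to worst agreement with the restraints."""
    return sorted(models,
                  key=lambda m: restraint_score(m, labeled, disulfide),
                  reverse=True)


# Made-up example: three labeled residues and one disulfide pair.
labeled_residues = [12, 30, 57]
disulfide_pair = (20, 80)
model_a = CandidateModel("model_A",
                         rsa={12: 0.45, 30: 0.60, 57: 0.35},
                         sg_coords={20: (0.0, 0.0, 0.0), 80: (2.0, 0.0, 0.0)})
model_b = CandidateModel("model_B",
                         rsa={12: 0.05, 30: 0.60, 57: 0.10},
                         sg_coords={20: (0.0, 0.0, 0.0), 80: (15.0, 0.0, 0.0)})
ranked = rank_models([model_b, model_a], labeled_residues, disulfide_pair)
print([m.name for m in ranked])
```

In this sketch every restraint carries equal weight; a real analysis would probably weight the disulfide more heavily, since a single covalent constraint is far more discriminating than one accessibility observation.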
Finally, we constructed a docking model of the two domains of PAB1283 onto the tRNA substrate (Supplementary Data), based on experimental restraints from interactions between different elements of both molecules. This docking model confidently places the catalytic MTase domain of PAB1283 near the G10 nucleoside, at the junction between the anticodon stem and the D-stem, and the THUMPα domain at the co-axially stacked helices of the T-arm and the acceptor stem. Most of the interactions between the THUMPα domain and the tRNA occur through the phosphate backbone, suggesting that it is the RNA structure rather than the sequence that is recognized by THUMPα. Recently, it was found that the T-arm is an essential specificity determinant for PAB1283 (21). Likewise, the minimal substrate for the tRNA:s4U8 synthase ThiI from E.coli is an RNA mini-helix comprising the acceptor and T-stems (26). Thus, the specificity determinants of E.coli ThiI and TrMet(m2,2G10) appear to be strikingly similar, despite the fact that these enzymes use unrelated catalytic domains to carry out completely different reactions. If the C-terminal domain is removed from our docking model of PAB1283, the site of U8 is relatively exposed, so that the catalytic domain of the thiouridine synthase could be modeled to bind tRNA in a manner similar to that predicted for the MTase domain of PAB1283. For both enzymes to carry out their respective reactions, some conformational rearrangement in the substrate tRNA, such as base flipping (52), must occur to expose the target nucleoside and place it in the active site. Likewise, the Tan1 protein required for N4-acetylcytidine formation at position 12 appears to contain not only the THUMP domain, but also an N-terminal extension that exhibits a structure similar to that of the NFLD domain (J. M. Bujnicki, unpublished data). Thus, we postulate that ThiI thiolase, TrMet(m2,2G10) and the enzymes responsible for N4-acetylcytidine formation at position 12 possess a common THUMPα module [comprising the NFLD and THUMP (sub)domains] that should bind tRNA in a very similar manner.
In conclusion, our docking model provides a useful platform for future studies and can be tested experimentally. In particular, footprinting and cross-linking analyses could provide specific restraints for inter-residue distances that could be used to refine the current model. The THUMP-domain-containing protein Tan1 from S.cerevisiae is required for the formation of N4-acetylcytidine at position 12 (27), a nucleoside that is located just between G10 and U8. It is of interest to determine what the minimal substrate for this enzyme could be. It also remains to be determined whether other tRNA-modifying enzymes that contain the THUMP domain interact with the substrate in the same way as TrMet(m2,2G10), i.e. with structured tRNA as the target and the co-axially stacked helices of the T-arm and the acceptor stem as unmodified support. Such a hypothesis may be the basis of original strategies to search for the function of these still uncharacterized THUMP-containing proteins.